ABSTRACT 

The Hawkins-Stafford Education Improvement Amendments 
of 1988 established the National Cooperative Education Statistics 
System and directed the Commissioner of the National Center for 
Education Statistics to support the development and implementation of 
standards for education data collection, processing, analysis, and 
reporting. The Cooperative Education Data Collection and Reporting 
Standards Project was initiated and a task force was formed to plan, 
produce, review, and disseminate the standards, which are concerned 
with processes, rather than results. This document was developed to 
help improve the usefulness and timeliness of education data, but it 
does not describe the types of data that should be collected. 
Standards for Education Data Collection and Reporting (SEDCAR) are 
developed and organized into the following phases: management of 
data collection and reporting, design, data collection, data 
preparation and processing, data analysis, and reporting and 
dissemination of data. Each standard has at least four components: 
(1) identification of the phase the standard falls under; (2) the 
subject (topic) of the standard; (3) a statement of purpose; and (4) 
guidelines to best practice. Some of the SEDCARs contain related 
standards information and checklists of procedural steps. 
Thirty-eight SEDCARs are outlined. Two appendices provide information 
about related standards, and an index and glossary are included. A 
14-item list of references is included. (SLD) 
Preface 



The Hawkins-Stafford Education Improvement Amendments of 1988 (P.L. 100-297) 
established the National Cooperative Education Statistics System (Cooperative System), a 
joint program of the National Center for Education Statistics of the U.S. Department of 
Education and the states. The goal of the project is to improve the comparability, quality, 
and usefulness of data collected from states and other education entities on the condition of 
education in the nation. To help achieve this objective, the legislation directed the 
Commissioner of the National Center for Education Statistics to support the development 
and implementation of standards for education data collection, processing, analysis, and 
reporting. 

The Cooperative Education Data Collection and Reporting (CEDCAR) Standards 
Project was initiated to produce these standards through the combined efforts of data 
providers, producers, and users at the local, state, and federal levels. This document, 
Standards for Education Data Collection and Reporting (SEDCAR), is the product of that 
cooperative effort. 

A Task Force of data system professionals, drawn primarily from the membership of 
the National Forum on Education Statistics, assumed major responsibility for planning, 
producing, reviewing, and disseminating the Standards. A Task Group of subject-area 
specialists assisted the Task Force in drafting the document. 

The CEDCAR Standards Project is a three-phase effort extending over a period of 
more than two years. During Phase 1 (July 1989-December 1989), the Task Force of local, 
state, and federal representatives began laying the foundation for the development of the 
Standards. The Task Force reviewed related standards, decided upon the most useful scope 
and format for the document, and created a plan for developing the Standards. 

In Phase 2 (January 1990-December 1990), Task Force and Task Group members 
developed an initial draft of the Standards for review by state representatives of the 
Cooperative System and federal agency staff. Task Force members also designed the field 
review plan and informed intended audiences about the progress of the project. 

Phase 3 (January 1991-September 1991) encompassed peer and field review of the 
draft Standards at the state and local levels, review by specialists in statistics and education 
research, revision of the document based on reviewers' comments, submittal of the 
Standards for endorsement by the National Center for Education Statistics and the National 
Forum on Education Statistics, and dissemination to intended audiences. 

The entire planning, development, and review process of this project relied upon the 
active involvement of local, state, and federal members of the Cooperative System in an 
iterative process intended to bring about consensus on the Standards. This broad-based 
participation was deemed critical to the creation of Standards that would meet the dual 
goals of technical excellence and usefulness. 






Acknowledgments 



The Cooperative Education Data Collection and Reporting (CEDCAR) Standards Project is a 
collegial effort based upon the quality-improvement management principles of teamwork and 
cooperation at all levels of the U.S. education system. The project succeeded so well in achieving the 
goal of broad professional involvement that it would be impossible to cite all of those who made 
substantive contributions to this work. Only a few can be acknowledged in these paragraphs. 

First, it is important to emphasize that the Task Force and Task Group members are the actual 
authors of the Standards for Education Data Collection and Reporting (SEDCAR). In addition to 
determining the format, content, and style of the Standards, the Task Force and Task Group members 
drafted, reviewed, and revised the individual standards through many painstaking iterations. 

Second, it is important to point out that the Standards owe their existence to Emerson J. 
Elliott, Acting Commissioner for Education Statistics, National Center for Education Statistics (NCES), 
who envisioned the Standards as an essential tool for promoting comparability and uniformity of data 
collected and reported through the National Cooperative Education Statistics System. Paul Planchon, 
Associate Commissioner, NCES, has provided consistent guidance and support for the Standards as a 
Cooperative System project under his authority. Daniel Stufflebeam and James Sanders, both of 
Western Michigan University, shared the wisdom they have acquired through their extensive 
experience in developing professional standards in the field of education. Ramsay Selden and Barbara 
Clements of the Council of Chief State School Officers were equally generous in sharing their 
experience in working with state education agencies. The Westat team of Margaret Cahalan, Sheila 
Heaviside, Wendy Mansfield, and George Rush worked with dedication and efficiency that went well 
beyond the nine-to-five workday. Finally, John Kotler and Carol Litman contributed the editing that 
pulled it all together. 

David L. Bayless 
Westat, Inc. 
Task Force Chair 

Lee M. Hoffman 
National Center for Education Statistics 
Project Officer 

The Standards for Education Data Collection and Reporting were developed through the 
contributions of many individuals, identified in the pages that follow. 

TASK FORCE 

The following persons assumed the lead responsibility for developing specifications, drafting 
standards, and reviewing each revision as the Standards evolved: 



Chair 

David L. Bayless, Westat, Inc. 






Committee Members 

David Angus, School of Education, University of Michigan 

Janice Baker, Rhode Island Department of Education 

Joel Bloom, New Jersey Institute of Technology 

Lawrence Bussey, Office for Civil Rights, U.S. Department of Education 

Scott Buswell, Montana Office of Public Instruction 

Charles Carrick, Louisiana Department of Education 

Robert Friedman, Florida Department of Education 

Steve Gorman, National Center for Education Statistics 

Ronald Henderson, National Education Association 

Roger Hummel, Pennsylvania Department of Education 

Jo Ann Kerrey, South Carolina Department of Education 

Glynn Ligon, Austin (Texas) Independent School System 

Paul Siegel, United States Bureau of the Census 

Ray Turner, Dade County (Florida) Public Schools 

Elizabeth VanderPutten, National Science Foundation 

Katherine Wallman, Council of Professional Associations on Federal Statistics 



TASK GROUP 

The following persons contributed to the initial draft of the Standards and provided comments 
and suggestions for the improvement of each succeeding draft: 

David Boesel, Department of Defense Manpower Data Center 

Terrence Davidson, Livonia (Michigan) Public Schools 

Roy Forbes, Center for School Accountability, North Carolina 

John Fremer, Educational Testing Service 

Lori Henkenius, Nebraska State Department of Education 

Philip Kaufman, MPR Associates, California 

C. Philip Kearney, School of Education, University of Michigan 

Marjorie Mastie, Washtenaw (Michigan) Intermediate School District 

Carol Norris, Arizona Department of Education 

Chris Pipho, Education Commission of the States 

Suzanne Triplett, North Carolina Department of Public Instruction 

James Watkins, Maine Department of Educational and Cultural Services 



PROJECT SPONSOR 

The following individuals from the National Center for Education Statistics provided guidance 
and oversight for the project: 

Emerson J. Elliott 
Lee M. Hoffman 
Paul Planchon 






PROJECT STAFF 

The following individuals from Westat coordinated the development of the Standards and 
provided staff support for the Task Force and Task Group efforts: 

David L. Bayless 
Margaret Cahalan 
Sheila Heaviside 
Wendy Mansfield 
George Rush 

CONSULTANTS AND SUBCONTRACTORS 

The following persons contributed their expertise during the development process: 

Barbara Clements, Council of Chief State School Officers 
James Sanders, Western Michigan University 
Ramsay Selden, Council of Chief State School Officers 
Daniel Stufflebeam, Western Michigan University 



PEER REVIEWERS 

The following individuals reviewed and provided recommendations for the first draft of the 
Standards: 

John Adams, Data Recognition Corporation 

John Allen, Division of Special Education, Missouri Department of Education 
Nadir Atash, Westat, Inc. 

Sarah Beard, Assessment and Compliance, Mississippi Department of Education 

Bob Beecham, Information Services, Nebraska Department of Education 

Glenn C. Boerrigter, Office of Vocational and Adult Education, U.S. Department of Education 

John T. Brady, Florida Department of Education 

M. D. Brasel, New Mexico Department of Education 

J. Michael Brick, Westat, Inc. 

Fredrick Brigham, National Catholic Education Association 

William Brown, Research, Testing and Accreditation Services, North Carolina Department of Education 
Ken Burgdorf, Westat, Inc. 
John Burke, Westat, Inc. 

Tim Callahan, National Association of State Boards of Education 

Adam Chu, Westat, Inc. 

Mick Couper, Bureau of the Census 

Joe Creech, Southern Regional Education Board 

Kevin Crowe, Planning, Research and Evaluation, Nevada Department of Education 






Robert Dutton, Finance and Administration, Idaho Department of Education 

Elizabeth Farris, Westat, Inc. 

Thomas Hagler, Bibb County (Georgia) Schools 

Jill Hanson, Washington School Information Processing Cooperative 

Ann Harrison, Program Planning, Research and Evaluation, Kansas Department of Education 

Donald C. Holznagel, Computer Technology Program, Northwest Regional Educational Laboratory 

Mavis E. Kelley, State Board and External Relations, Iowa State Department of Education 

C. Thomas Kerins, Student Assessment, Illinois Department of Education 

Dick Kulka, National Opinion Research Corporation 

Charles Lenth, SHEEO/NCES Communications Network 

J. Vincent Madden, Information Systems, California Department of Education 

John McClure, Data Management, West Virginia Department of Education 

Joyce McCray, Council for American Private Education 

Claudia Merkel-Keller, Division of Vocational Education, New Jersey Department of Education 
Paul Moore, Research Triangle Institute 

Sean Mulhearn, School Psychological Services, Wisconsin Department of Education 
Ken Olsen, University of Kentucky 

Ed Penry, Student Information Management, School District of Philadelphia 
John Pisapia, Virginia Commonwealth University 

Al Rasp, Assessment and Accreditation, Washington Department of Education 

Anne Raymond, Publication and Information, Georgia Department of Education 

Ed Roeber, Education Assessment, Michigan Department of Education 

Michael Rubin, Evaluation and Training Institute 

Stanley Rumbaugh, Brandeis University 

Thomas Saterfiel, Research, American College Testing 

Barbara Shay, Bureau of Occupational Education, New York State Education Department 

Arlen Sheldrake, Management Information Systems, Multnomah (Oregon) Education Service District 

Caryn Shoemaker, Arizona Department of Education 

T.G. Smith, Child Nutrition Programs, Alabama Department of Education 

H. M. Snodgrass, Research and Planning, Kentucky Department of Education 

John J. Stiglmeier, Information Center on Education, New York State Education Department 

Marianne Strusinski, Educational Accountability, Dade County (Florida) Public Schools 

Ron Torgeson, Information and Research, North Dakota Department of Education 

Susan Tyson, State Assessment, Georgia Department of Education 

Joe Waksberg, Westat, Inc. 

Carole D. White, Research and Evaluation, Delaware Department of Education 
David Wiley, Northwestern University 

Trevor Williams, The Australian Council for Educational Research Limited and Consultant to Westat, 
Inc. 

Joan Wills, Institute for Education Leadership 






PILOT APPLICATION PARTICIPANTS 

The following individuals participated in the Pilot Application of the Standards and contributed 
recommendations for improvement: 

West Palm Beach County (Florida) Public Schools 
Joe Abalos 
Marc Barron 
Sharon Brannon 
Gary Gramenz 

North Carolina State Department of Education 

John Bolton 
William Brown 
Michael Poteet 
Suzanne Triplett 

New York State Education Department 
Lynn Humiston 
Len Powell 
John Stiglmeier 

Council of Chief State School Officers 

Barbara Clements 

EDITORIAL CONSULTANTS 

Editorial assistance was provided by: 

John D. Kotler, Kotler Editorial Associates 
Carol Litman, Westat, Inc. 

WORD PROCESSING AND GRAPHIC DESIGN SUPPORT 
Westat 

Saunders Freeland 
Susan Robbins Hein 
Sylvie R. Warren 
Marguerite Winslow 

CLERICAL ASSISTANTS 

Clerical assistance was provided by: 

Westat 

Rita Buck 
Annie Golden 
Debbie Zimmerman 






Introduction 



The Standards for Education Data Collection and Reporting (SEDCAR) set forth 
principles that represent best practice in the collection, processing, analysis, and reporting of 
education statistics. For the purposes of this project, a standard is defined as a principle 
commonly agreed upon by those engaged in education data collection and reporting for guiding 
and assessing the quality of individual data collection and reporting activities. The Standards 
are concerned with processes, not results. 

This document was developed cooperatively, with contributions from a wide range of 
data providers, producers, and users, including advocates and professional researchers from 
local, state, and federal agencies and from the private sector. The Standards are intended to 
help improve the usefulness, timeliness, accuracy, and comparability of education data that 
inform key policy decisions at all levels of the U.S. education system, with the ultimate goal 
of improving education. Although the Standards were designed specifically for data that fall 
within the scope of the National Cooperative Education Statistics System, they are applicable 
to a broader range of education data collection and reporting activities. It is hoped that the 
Standards will be used by state and local education agencies, federal agencies, schools, research 
and professional organizations, academic researchers, and policymakers. Because they were 
developed initially for the National Cooperative Education Statistics System, however, they are 
most relevant to data collection activities within the Cooperative System. 

The Standards do not attempt to describe the types of data that should be collected. For 
example, they do not specify what indicators the National Cooperative Education Statistics 
System should collect. Rather, the Standards are intended to serve as a guide to the key phases 
of data collection and reporting. They identify the qualities that characterize good measures and 
describe the process of selecting and evaluating appropriate measures that will result in data 
of the highest quality: data that provide useful, timely, accurate, and comparable information. 



Data Collection and Reporting Phases 

This document takes a comprehensive view of the processes that occur during each 
phase of data collection and reporting, once a data need has been identified. It guides the 
reader step by step through these processes, from the initial articulation of a data need through 
the fulfillment of the data requirement. The following six phases form the conceptual 
framework in which the Standards have been developed and organized: 

■ Management of Data Collection and Reporting 

■ Design 

■ Data Collection 

■ Data Preparation and Processing 

■ Data Analysis 

■ Reporting and Dissemination of Data 
Within these phases, an attempt has been made to arrange individual standards in the order in 
which they would be performed in an actual data collection and reporting activity. Sometimes, 
though, the processes addressed in different standards may occur simultaneously. 

Although the Standards are divided into distinct phases, the phases are interrelated. 
Individuals working on one phase should be familiar with the standards for other phases. The 
standards that guide earlier phases of a data collection activity are still relevant during later 
phases. For example, data processing staff may find it necessary to refer to the data collection 
standards for guidance on nonresponse followup activities. Similarly, standards in later phases 
are relevant during earlier phases of a data collection activity. For example, the data analysis 
standards should be considered during the design phase. Furthermore, individuals participating 
in different phases should communicate with each other throughout the data collection and 
reporting activities to ensure coordination and integration of the various tasks. 



Guide to Readers 

Some phases and standards in this document are more technical than others. The level 
of technical specificity appropriate for each standard differs by subject matter. Readers are 
encouraged to seek out supplemental resources if more general or specific guidance is needed 
in a particular area. Furthermore, some of the standards do not apply uniformly to all data 
collection activities. The amount of attention paid to each guideline will often be 
commensurate with the size and importance of the data collection activity. 







Standards and Checklists 

The Standards are composed primarily of standards for each major phase of data 
collection and reporting. Every standard contains a statement of purpose and a series of 
guidelines that describe the "best practice" for fulfilling the purpose of the standard. When 
appropriate, related standards and checklists are cited to provide additional guidance in an area 
addressed by one or more of the standards. 

Each of the major phases addressed by the Standards begins with an introduction that 
includes a discussion of the scope of the phase, the underlying assumptions, and the intended 
audiences. Limitations and potential problems are also discussed. 

Each standard has at least four components (two others are added when applicable) 
arranged in the following order (see sample standard format on page xiii): 

■ Phase - Identifies which of the six phases the standard falls under. 

■ Subject - Identifies the topic of the standard. Subjects are in chronological order 
within phases. 

■ Statement of Purpose - Provides the objective of the standard. 

■ Guidelines - Provides "best practice" procedures to be followed in order to 
achieve the objective identified in the statement of purpose. The guidelines are 
chronological steps within the standard. 

Some standards contain one or both of the following: 

■ Related Standards - Reference other standards within the document that readers 
may consider when applying a standard. 

■ Checklists - Specify procedural steps to follow to help achieve the purpose of 
the standard. These steps may expand upon an individual guideline, or they 
may further develop the entire standard. 
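
The component list above can be pictured as a simple record. The following is only an illustrative sketch in Python, not part of the Standards: the class name, field names, and the rule that a standard needs at least one guideline are assumptions made for the example, and the purpose text in the sample instance is abbreviated.

```python
from dataclasses import dataclass, field
from typing import List

# The six phases named in the Introduction, keyed by their one-digit numbers.
PHASES = {
    1: "Management of Data Collection and Reporting",
    2: "Design",
    3: "Data Collection",
    4: "Data Preparation and Processing",
    5: "Data Analysis",
    6: "Reporting and Dissemination of Data",
}


@dataclass
class Standard:
    """One standard: four required components plus two optional ones."""

    phase: int                      # which of the six phases it falls under
    subject: str                    # topic of the standard
    purpose: str                    # statement of purpose
    guidelines: List[str]           # "best practice" steps, in order
    related_standards: List[str] = field(default_factory=list)  # optional
    checklists: List[str] = field(default_factory=list)         # optional

    def __post_init__(self) -> None:
        if self.phase not in PHASES:
            raise ValueError(f"phase must be 1-6, got {self.phase}")
        # Assumption for this sketch: every standard has at least one guideline.
        if not self.guidelines:
            raise ValueError("a standard needs at least one guideline")

    def phase_name(self) -> str:
        return PHASES[self.phase]


# Illustrative instance based on the sample standard (purpose abbreviated).
example = Standard(
    phase=3,
    subject="Standard for Selecting and Training Data Collection Staff",
    purpose="To ensure that data collection staff are able to carry out "
            "the collection according to plan.",
    guidelines=[
        "Training should be designed based on the collection methodology "
        "to be used.",
    ],
    related_standards=["Standard for Ethical Treatment of Respondents"],
)
```

Keeping the two optional components as empty lists by default mirrors the statement that they are added only when applicable.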



Numbering System 

Phases are identified by one-digit numbers from one to six. Standards within each 
phase are identified by two-digit numbers, the first identifying the phase and the second 
identifying the standard's order within the phase (e.g., 1.2.). Guidelines have three-digit 
numbers, the first two identifying the phase and the standard and the third identifying the 
guideline's order within the standard (e.g., 1.2.1.). Checklists also have three-digit numbers, 
which correspond to the most relevant guideline that they expand upon. 
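
As a rough illustration of this numbering scheme, the sketch below splits an identifier such as "3.2.1." into its phase, standard, and guideline (or checklist) parts. It is a hypothetical helper written in Python, not part of the Standards; the names `SedcarId` and `parse_id` are invented for the example, and section labels used for phase introductions (such as "1.0.") are deliberately not covered.

```python
import re
from typing import NamedTuple, Optional


class SedcarId(NamedTuple):
    """Parsed identifier: phase, then optional standard and guideline order."""

    phase: int
    standard: Optional[int]
    item: Optional[int]

    @property
    def kind(self) -> str:
        if self.standard is None:
            return "phase"
        if self.item is None:
            return "standard"
        # Checklists share the number of the guideline they expand upon.
        return "guideline or checklist"


# Phase 1-6, then up to two dot-separated positive numbers; a trailing
# period (as printed in the document, e.g. "1.2.1.") is tolerated.
_ID = re.compile(r"^([1-6])(?:\.([1-9]\d*)(?:\.([1-9]\d*))?)?\.?$")


def parse_id(text: str) -> SedcarId:
    match = _ID.match(text.strip())
    if match is None:
        raise ValueError(f"not a phase, standard, or guideline number: {text!r}")
    phase, standard, item = match.groups()
    return SedcarId(
        int(phase),
        int(standard) if standard else None,
        int(item) if item else None,
    )
```

For instance, `parse_id("2.4")` yields phase 2, standard 4, and no guideline number, so its `kind` is "standard".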
Related Information 



This document includes appendices, an index, references, and a glossary of technical 
terms to assist readers in using the Standards. 

■ Appendices - Contain tables identifying standards developed by other agencies 
and organizations that were used in the development of particular standards in 
this document and that may be referred to for additional information. 

■ Appendix A - Describes the seven sets of standards that were used as 
reference material throughout the six phases. 

■ Appendix B - Describes the six sets of standards that were referred to for 
particular subject matter covered by one or more phases. 

■ Index - Groups standards in different phases by key topic areas (e.g., 
management, training, documentation, confidentiality) for easy reference. 

■ References - List sources used in the development of some standards. These 
sources may provide more extensive guidance in particular topic areas. Some 
of the references provide definitions beyond those included in the glossary of 
this document. 



■ Glossary - Technical terminology has been avoided when alternative language 
could be used without a loss of accuracy. In some cases, however, technical 
terminology is necessary to signify the full meaning and intent of the Standards. 
These terms are defined in the glossary. Terms covered by the glossary are printed in 
italics the first time they appear in the text of each standard. 



Terminology 

Throughout this document, specific terms are used to refer to key elements and 
participants in data collection and reporting activities. 

■ Data collection activities - The Standards use this phrase to encompass all 
phases of data collection and reporting. 

■ Data requestor - Agency or organization that requests or sponsors the data 
collection and reporting activity. 




SAMPLE STANDARD FORMAT 

[Phase] 
3. DATA COLLECTION 

[Subject] 
3.2. Standard for Selecting and Training Data Collection Staff 

[Purpose] 
PURPOSE: To ensure that data collection staff are able to carry out the collection according 
to plan with a minimum of inaccuracies, intrusion, and burden. 

[Guidelines, each preceded by its 3-digit guideline number] 
GUIDELINES 

Staff training should reflect the complexity of the project. For complex data collection activities, or 
those that require the collector to deviate from standard questions, training should include, at a 
minimum, a thorough explanation of the study goals, guidelines for deviating from or expanding on 
standard questions, and methods for documenting the collection activity. (See Checklist for 
Training for Data Collectors.) 

Training should be designed based on the collection methodology to be used. For example, more 
extensive training is usually required for instruments with open-ended items. 

Staffing resources should be sufficient to ensure that replacement personnel are available on an 
as-needed basis. 

[Related Standards] 
RELATED STANDARDS 

3.3. Standard for Ethical Treatment of Respondents 

3.4. Standard for Minimizing Burden and Nonresponse 


SAMPLE CHECKLIST FORMAT 

3. DATA COLLECTION 

3.2. Standard for Selecting and Training Data Collection Staff 

[Checklist: the 3-digit checklist number corresponds to the guideline number] 
3.2.1. Checklist for Training for Data Collectors 

Upon completion of training, data collectors should understand the following: 

1. All definitions used in the collection instrument 

2. The persons or records from whom/which the data are to be collected (e.g., which teachers will be 
interviewed? what classes will take the test? what records will be examined?) 

3. The date, time, and duration of the test or data collection activity 

■ Data producer - Agency or organization that carries out the actual study design, 
data collection, processing, analysis, and reporting. Encompasses all members 
of the project staff, including managers, data collectors, data processors, data 
analysts, and data reporters. 

(In some cases, the same agency or organization is both the data requestor and 
the data producer. In others, they are different entities.) 

■ Data provider - Agency, organization, or individual who supplies data. For 
example, in a national education survey, data providers might include state 
education agencies, local education agencies, school districts, schools, teachers, 
students, and parents. 

■ Respondent - This term is used particularly in the standards for the Data 
Collection Phase, referring specifically to the individual or agency that 
completes a survey (e.g., the person who marks the answers on a survey 
instrument or who provides answers verbally to a data collector). 

■ Data users - Agencies, organizations, or individuals who use the data developed 
by the data producer. Although the term data user may refer to the data 
requestor (the entity that originally requested the data), it may also refer to other 
entities or individuals, including other agencies, individual researchers, the 
media, and members of the public, who utilize the results of a data collection 
activity in some way. 



Best Practice 

The Standards can be used to determine the differences between actual procedures used 
in a particular data collection and reporting activity and the recommended "best practices." 
The document also provides the material to stimulate a planned program of continuous 
professional growth. 

The Standards are not intended, however, to be used to measure compliance with 
externally imposed requirements. Therefore, it is the intent of the authors that the adoption and 
adaptation of the Standards be voluntary. Readers, however, are urged to consider applying 
these principles in a systematic manner to their data collection activities. Those involved with 
the CEDCAR Standards Project believe that to do so will greatly enhance the accuracy and 
credibility of education data. 
Comments and Suggestions 

The Standards for Education Data Collection and Reporting are intended to stimulate 
an ongoing process of review and improvement of education data collection and reporting 
activities. In order for the Standards to serve as a useful guide in the data quality improvement 
effort, the experiences and opinions of users must be incorporated into future revisions of this 
document. 

The best practices included in this document were suggested and refined by a group of 
experts in the collection and reporting of education data. A different group of experts may 
have arrived at a slightly different set of best practices. Users of this document are encouraged 
to contribute to the quality of the Standards by providing comments and suggestions on any 
practices that may have been omitted. 

Please send comments on your experiences using the Standards or suggestions for 
improving the value and utility of the Standards to: 

NCES CEDCAR Standards Project 
555 New Jersey Avenue, N.W. 

Room 410 
Washington, D.C. 20208-5651 

If you are interested in adding your name to the list of persons receiving updated versions of 
the Standards, please write to the above address. 
1.0. Introduction to Management of Data Collection and Reporting 



Successful implementation of the standards and guidelines articulated in Phases 2-6 of the 
Standards (Design, Collection, Processing, Analysis, and Reporting) depends largely upon the creation of 
an organizational environment and organizational structures that encourage and fully support the 
production of high quality information. 

Data are not free. Organizational resources must be devoted to the designing, collecting, 
processing, analyzing, and reporting phases of a data collection activity. Therefore, data-related activities 
must be managed and coordinated in order to focus available resources where they are most needed and 
in the most efficient and cost-effective manner. 

The standards included in this phase are all based on a key assumption: that there is a clear and 
important information need that cannot otherwise be met. Therefore, at the most fundamental level, 
processes must be put into place that provide the foundation for sound management and policy decisions 
about which data collection and reporting initiatives to pursue. Such decisions must be based on adequate 
information and must include the timely involvement and participation of all interested parties. 

The first standard in this phase describes the creation of an organizational culture that is conducive 
to a quality management philosophy. Within this environment, management procedures must be put into 
place to ensure successful implementation of data collection activities. The remaining three standards deal 
with justifying, supporting, and managing an individual data collection. 

Words in italics are defined in the glossary. 
GUIDELINES 



1.1.1. Procedures should be implemented for routinely sensing the information needs of legislators, 
policymakers, practitioners, and the public. 

1.1.2. Clearinghouses should be established and maintained for documenting reports, collection 
instruments, definitions, and records of available data. 

1.1.3. Procedures should be developed for reviewing and approving data collection requests. Efforts 
should be made to coordinate federal, state, and local reviews, when feasible. 

1.1.4. Advisory task forces or committees, composed of data requestors, providers, producers, and users, 
should be established at every level (e.g., federal, state, local) to provide guidance on coordinating 
data collection activities. 

1.1.5. Regular evaluations of ongoing data collections should be scheduled to: 

■ Assess continuing need 

■ Analyze accuracy 

■ Examine utility 

■ Review collector and provider burden and costs 

■ Evaluate confidentiality procedures 

■ Review propriety/appropriateness 

1.1.6. Data producers and providers should adhere to standard definitions of data elements. Definitions 
of data elements should be periodically reviewed, published, and made available to data providers, 
data producers, and data users. 
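
One way to picture guideline 1.1.6 is a shared dictionary of data element definitions against which submitted records are checked. The Python sketch below is a minimal illustration under stated assumptions: the element names, definitions, and permitted values are hypothetical and are not taken from the Standards or from any published data dictionary.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Mapping


@dataclass(frozen=True)
class DataElement:
    """A standard definition of one data element (hypothetical structure)."""

    name: str
    definition: str
    allowed_values: FrozenSet[str]


# Hypothetical entries, for illustration only.
DICTIONARY: Dict[str, DataElement] = {
    "school_level": DataElement(
        "school_level",
        "Instructional level served by the school.",
        frozenset({"elementary", "middle", "high"}),
    ),
    "enrollment_status": DataElement(
        "enrollment_status",
        "Status of the student on the official count date.",
        frozenset({"enrolled", "withdrawn"}),
    ),
}


def check_record(
    record: Mapping[str, str],
    dictionary: Mapping[str, DataElement] = DICTIONARY,
) -> List[str]:
    """Return a list of departures from the standard definitions."""
    problems: List[str] = []
    for key, value in record.items():
        element = dictionary.get(key)
        if element is None:
            problems.append(f"{key}: not a defined data element")
        elif value not in element.allowed_values:
            problems.append(f"{key}: value {value!r} is outside the definition")
    # Elements defined in the dictionary but absent from the record.
    for name in dictionary:
        if name not in record:
            problems.append(f"{name}: missing from record")
    return problems
```

A record that uses only defined elements and permitted values produces an empty list; anything else is reported so that the provider and producer can reconcile it against the published definitions.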
GUIDELINES 

1.2.1. Data requestors should demonstrate that the data to be produced will be of sufficient value, 
applicability, and usefulness to justify the cost and burden. The specific cost of collection to both 
data producers and providers, as well as the burden imposed, should be identified. (See 1.2.1. 
Checklist for Justifying Data Collection Activities.) 

1.2.2. A determination should be made about whether the data can be obtained more appropriately from 
other sources within or outside the collecting agency. Reasons should be specified why similar 
available data cannot be used for the stated purposes. 

1.2.3. An opportunity should be provided for comments by data providers and users on the proposed
collection of information. Concerns should be documented, along with the responses to those 
concerns. 



RELATED STANDARDS 



2.4. Standard for Assessing the Value of Obtainable Data 
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1.2. Standard for Justifying Data Collection Activities
1.2.1. Checklist for Justifying Data Collection Activities

1. Document the circumstances that make the collection of information necessary, including any legal 
or administrative requirements. 

2. Indicate as specifically as possible how, by whom, and for what purpose the data will be used. 

3. Determine whether available data can be used to meet an emerging information need before 
initiating a new collection. 

4. Identify required data collection activities, as well as the accuracy and specificity necessary to 
achieve collection objectives. 

5. Analyze the costs and benefits of the proposed data collection to the producer and provider and, 
where appropriate, the costs of alternative strategies.

6. Review the terminology and data definitions to be used in the data collection to ensure that they
conform to accepted usage. Any deviations from accepted usage should be explained. Definitions 
should conform whenever possible to nationally developed definitions to ensure that the data 
produced will be comparable to data produced by education agencies and organizations at the 
school, district, state, and federal levels. 

7. Document data providers' concerns and data requestors' responses to those concerns. 
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GUIDELINES 

1.3.1. Data requestors, users, and providers should be involved in planning data collection activities. 
Preliminary plans for data collection, processing, analysis, and reporting should be prepared for 
discussion with, and review by, these groups. 

1.3.2. Efforts should be made to coordinate data collection activities that collect similar information (for 
the same or different purposes) at different governmental levels (federal, state, local) and within 
agencies at each level. 

■ Interagency and intergovernmental meetings should be conducted routinely to determine 
whether multiple agency data needs can be met by a single data collection and to encourage 
agencies to share appropriate information in order to reduce the burden on providers. 

■ General education statistical collections should be consolidated when possible with collections 
required for administrative or regulatory purposes. 

■ Education agencies should cooperatively explore and establish data systems that will expedite 
the exchange of information to meet federal, state, and local data needs. 

1.3.3. Data collection activities should be scheduled in consultation with data providers at the local, 
state, and federal levels to accommodate annual planning, recordkeeping, and processing 
requirements. Early notification of plans and specifications should be provided for new or 
changing data collections that impact current computer or recordkeeping systems. 

1.3.4. Data providers should receive training and technical assistance. 
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GUIDELINES 

1.4.1. A description of the goals and objectives for the data collection activity should be developed and 
provided to all staff. Top management should demonstrate a continuing commitment to data 
collection by providing the resources, strategies, and related tools or methods necessary to achieve 
these goals and objectives. 

1.4.2. Specific responsibilities, along with commensurate authority to carry out these responsibilities, 
should be assigned to an individual or unit for management and coordination of the data collection 
activity. 

1.4.3. Orientation and training should be conducted for agency staff to ensure data quality consciousness, 
improve skills in producing high-quality data, and address standards of design, collection, 
processing, analysis, and reporting. 




PHASE 2. DESIGN 



2.0. Introduction to Design 

2.1. Formulating and Refining Study Questions 

2.2. Choosing the Data Collection Methods 

2.3. Developing a Sampling Plan 

2.4. Assessing the Value of Obtainable Data 

2.5. Transforming Study Question Concepts into Measures 

2.6. Designing the Data Collection Instrument 

2.7. Minimizing Total Study Error (Sampling and Nonsampling) 

2.8. Reviewing and Pretesting Data Collection Instruments, Forms, and 
Procedures 

2.9. Preparing a Written Design 
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2.0. Introduction to Design 



Once an information need has been established, the project staff can begin to explore ways to 
gather the required data. Design is the process of formulating the study questions and developing and 
describing a plan for conducting the collection, processing, analysis, and reporting of data. If the data 
collection activity is well designed, the time and resources expended during the design phase will prevent 
costly mistakes later in the project. An effective design produces accurate and useful information, thus
increasing the credibility and generalizability of the findings. An effective design also promotes timely
and efficient data collection and provides methods for resolving both expected and unexpected problems 
that may arise during data collection and analysis. The study design should incorporate methods and 
procedures established in advance to control and resolve inaccuracies in the data. 

The development of the design should be guided by the type of information to be collected, the 
unit of analysis, the types of analyses planned, and the purposes for which the data will be used. For 
example, the need to obtain highly personal or sensitive information will have an impact on the data 
collection methodology, instrument design, nonsampling errors, and virtually every other aspect of the 
activity. The design also depends, to some extent, on whether there is a need for descriptive, comparative, 
or cause-and-effect analyses. 

In developing a design, decisions made at one stage may require reconsideration of decisions made 
at an earlier stage in the design process. Decisions about sampling may influence the formulation of the 
study questions and issues. For example, program compliance issues concerning participants may require 
data collection from all recipients. Thus, although these standards are presented chronologically, not every 
data collection activity proceeds in exactly this order. In fact, many of the steps or tasks described in this 
section occur simultaneously. Standards for certain steps can greatly enhance preceding or subsequent 
stages. 

Words in italics are defined in the glossary. 
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GUIDELINES 

2.1.1. Study questions should be formulated to address the identified information needs. 

2.1.2. Study questions should be clearly defined, articulated, and reviewed to ensure that they address 
all aspects of the issues under investigation. (See 2.1.2. Checklist for Formulating and
Refining Study Questions.)

2.1.3. The study questions should: 

■ Reflect a knowledge of relevant literature 

■ Anticipate and respond to unintended outcomes 

■ Be capable of further refinement as research planning proceeds 

■ Be clear in their meaning, implications, and assumptions 

■ Eliminate bias as fully as possible to avoid any tendency to predispose the findings 

■ Attempt to break down problems into their constituent parts 

■ Be capable of being answered through practical data collection activities 

■ Focus on the information needs 

■ Be prioritized in order of importance 

■ Be broad enough in scope to cover the needs of the data requestor and, when possible, the 
needs of secondary data users 
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2.1. Standard for Formulating and Refining Study Questions 

2.1.2. Checklist for Formulating and Refining Study Questions

1. Ensure that study questions have the potential to address the data needs. 

2. Ensure that each study question asks only one question. 

3. Ensure that each study question does not beg another question, or set of questions, that must be 
resolved before the current question can be answered. 

4. Ensure that study questions do not make false assumptions. 

5. Ensure that study questions do not pose a false dichotomy (i.e., make sure that the possible 
alternative answers are truly different). 

6. Ensure that study questions do not attempt to resolve nonempirical problems by empirical means. 

7. Ensure that study questions are not merely semantic (solely about how terms ought to be used 
rather than about substantive issues or distinctions). 

8. Ensure that study questions are not true or false by definition (tautological) and, therefore, unable 
to be answered empirically. 

9. Ensure that study questions have the same meaning for different persons. 
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2.2.1. The appropriate methodologies should be determined based on the information required to answer 
the study questions, on what will cause the least burden to data providers, and on the available 
resources. 

2.2.2. The design must specify the following aspects in selecting a methodology: 1) the sources of the 
data; 2) when and how often data are to be collected; 3) the collection, processing, analysis, and 
reporting procedures to be used; and 4) the potential descriptive, comparative, or causal inferences 
to be made from the data. 

2.2.3. Appropriate alternative data collection methods should be identified. These methods may include 
record abstraction, survey (mail, telephone, in-person), observation, experiments, and secondary 
data analysis. Consideration should be given to employing more than one method, where 
appropriate. (See 2.2.3. Checklist for Selecting the Appropriate Sources of Data and 2.2.5. 
Checklist for Determining the Design Relative to Group Comparisons.) 

2.2.4. The design should address the issue of whether there is a need to measure and analyze change 
over time. The design should specify whether the data to be collected are to be cross-sectional 
(designed to measure only one point in time) or longitudinal (designed to study change over time). 
(See 2.2.4. Checklist for Determining the Design Relative to Analysis of Change.) 

2.2.5. The design should address the extent to which the data are to be used to make comparisons among 
groups or explore relationships among variables. The planned sample, data collection instruments, 
and analyses must be precise enough to support the planned comparisons and inferences. The 
design should specify the type of data collection activity to be conducted (e.g., descriptive, 
comparison group, normative, case controlled). (See 2.2.5. Checklist for Determining the
Design Relative to Group Comparisons.) 

2.2.6. The design should include procedures for measuring the quality of the data to be collected. 



RELATED STANDARDS 

2.2. Standard for Choosing the Data Collection Methods 

2.7. Standard for Minimizing Total Study Error (Sampling and Nonsampling) 
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2.2. Standard for Choosing the Data Collection Methods 

2.2.3. Checklist for Selecting the Appropriate Sources of Data




1. To identify feasible sources of information, ask the following questions: 

■ Can the information be obtained through analysis of existing data? 

■ Are records available from which the information can be compiled? 

■ Are knowledgeable persons available from whom the information can be gathered and 
assembled? 

■ Must the information be generated by controlled observation or measurement? 

2. To determine whether existing data can be used to answer the study question, ask: 

■ Are data available that are relevant to the study question? 

■ Do available data meet the criteria of reliability, validity, and other aspects of required 
technical quality? 

■ Are the data structured in a manner that provides the appropriate unit of analysis and that 
allows appropriate investigation of relationships among variables? 

■ Are the data sufficiently current?

■ If multiple sources of data are used, are the data sufficiently comparable? 

3. To determine whether administrative records can be used, ask: 

■ Are there administrative records that contain all the information needed (e.g., numbers and 
characteristics of students by race/ethnicity and gender) for the administrative unit/level 
required to address the study questions? 

■ Are there alternative methods of obtaining the records (once the administrative units in which 
the records are kept have been identified) that can accommodate different local 
recordkeeping practices and policies?

■ Are the definitions and concepts employed by the various jurisdictions involved comparable
and uniform? Be prepared to invest in methods (i.e., crosswalking) that attempt to make data 
comparable among various jurisdictions. 

■ Does an examination of the uses for which the records are kept reveal clues about possible 
distortions relative to the study questions? Administrative record data are no more immune 
to validity issues than data from any other source. 




4. To determine whether data can be collected from individual data providers, ask: 

■ Is there a person/position with access to the information being sought? If there is, can that 
person/position serve as data provider? 

■ Is the data provider being asked to obtain information from administrative records? If so,
consider the administrative record study checklist above (item 3). 

■ Is the data provider being asked to report information about a group or organizational unit 
in the absence of records? 

■ Is the person from whom the data are being requested the most capable person to act as a 
data provider? If not, can a procedure be designed for choosing the best data provider? 

5. To determine whether data can be obtained from an observation study, ask: 

■ Is it feasible to train staff or to hire trained observers? 

■ Do standard protocols for observation exist, or can they be created? 

■ Can the observers be granted access to the phenomena of interest? 
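
Item 3 of the checklist above recommends investing in methods such as crosswalking to make data comparable among jurisdictions. A minimal sketch of the idea in Python follows. The grade-level codes, the field name, and the function name are hypothetical and serve only as an illustration; an actual crosswalk would be built from the data element definitions of the jurisdictions involved.

```python
def apply_crosswalk(records, crosswalk, field):
    """Recode one field of each record from local codes to common codes.

    Returns the recoded records plus the records whose local code has no
    entry in the crosswalk, so that unmapped codes can be reviewed rather
    than silently dropped or guessed.
    """
    recoded, unmapped = [], []
    for record in records:
        local_code = record[field]
        if local_code in crosswalk:
            # Copy the record so the provider's original data stay unchanged.
            recoded.append({**record, field: crosswalk[local_code]})
        else:
            unmapped.append(record)
    return recoded, unmapped


# Hypothetical local grade codes mapped to one hypothetical common code set.
GRADE_CROSSWALK = {"K": "KG", "KDG": "KG", "1": "01", "01": "01"}

district_records = [
    {"district": "A", "grade": "K", "enrollment": 120},
    {"district": "B", "grade": "KDG", "enrollment": 95},
    {"district": "C", "grade": "PK", "enrollment": 40},  # no mapping defined
]

recoded, unmapped = apply_crosswalk(district_records, GRADE_CROSSWALK, "grade")
```

Returning the unmapped records separately is one possible design choice: it keeps definitional gaps visible so they can be resolved with the data providers.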
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2.2. Standard for Choosing the Data Collection Methods 

2.2.4. Checklist for Determining the Design Relative to Analysis of Change

1. Decide from the beginning which types of design are warranted in view of the need to measure 
or analyze change. 

2. If the data are needed for a snapshot at one point in time (for example, determining how teachers 
choose to spend unplanned surplus supply budgets for the coming year), consider a cross-sectional 
design. 

3. If the data are needed for looking at change over time, develop a longitudinal design. It is
especially important to make sure that all aspects of the study are carefully tested prior to 
beginning data collection. 

■ Trend Design: The data are needed for different samples at several points in time (for 
example, measuring county reading scores for third-graders over a number of years). 

■ Cohort Design: The data are needed from the same group at points over time although 
different samples can be used (for example, comparing whether attitudes toward sex 
education have changed over time for different samples of those who were parents of first- 
graders in 1983). 

■ Panel Design: The data are needed from the same sample over time (for example, following 
the same group of sixth-graders to observe the relationship between initial reading scores and 
gains in reading scores or educational persistence). 

4. Consider burden, problems in maintaining participation, operational concerns, and necessary 
resources in designing the frequency of data collections. 







2.2. Standard for Choosing the Data Collection Methods 

2.2.5. Checklist for Determining the Design Relative to Group Comparisons




1. Carefully consider the types of comparisons needed to answer the study questions. Consult
available technical sources when considering which types of design allow for the planned 
comparisons. 

Designs appropriate for group comparisons include: 

■ Descriptive Design: Sample is designed primarily to describe a total population and 
subpopulations. During analysis, comparisons are made among existing subgroups on 
descriptive data. (For example, compare the extent to which the percentage of colleges 
offering remedial courses varies across geographic regions.) 

■ Normative Comparison Design: Results of data collection are to be compared with data 
already available from other sources. (For example, compare county per-pupil educational 
expenditures with state and national averages.) Make sure the units and timeframes will be 
comparable (e.g., same grade levels and same year). 

■ Comparison Group Design: Groups are specifically created for comparison purposes. They
are then compared based on other characteristics of interest.

    ■ True experimental design: Assignment to groups is random (e.g., comparing college
    persistence for students with similar college entrance test scores randomly assigned to
    groups that will or will not participate in student support services).

    ■ Quasi-experimental design: Assignment to groups is not controlled by data collector
    and is not random (e.g., comparing college persistence for students with similar college
    entrance test scores who have and have not participated in student support services).
    Make sure that there is statistical control for other factors that may influence entrance
    into the groups and that may influence outcomes to be studied.

■ Case Control Design: Groups are selected because they either have or do not have the 
condition being studied. They are compared with regard to other variables judged to be of 
relevance to the condition (e.g., comparing learning disabled students with those who do not 
have learning disabilities by birth weight, controlling for socioeconomic status). 

2. Consider combining designs (e.g., longitudinal and comparison group designs) to provide designs 
necessary to answer study questions. 

3. Make sure that the design chosen to make group comparisons is consistent with sample plans, 
form design, data collection procedures, and the analysis plan. 
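
The true experimental design described above depends on random assignment to groups. The following Python sketch shows one simple way such an assignment might be carried out and documented. The function name, the group labels, and the student identifiers are hypothetical; the fixed seed is an assumption made so that the assignment can be reproduced and audited.

```python
import random


def randomly_assign(unit_ids, groups=("treatment", "control"), seed=None):
    """Randomly assign each unit to one of the groups, keeping group sizes balanced.

    The units are shuffled and then dealt to the groups in turn, so group
    sizes differ by at most one.
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    return {unit: groups[i % len(groups)] for i, unit in enumerate(shuffled)}


# Hypothetical identifiers for ten students with similar entrance test scores.
students = [f"S{i:03d}" for i in range(10)]
assignment = randomly_assign(students, seed=42)
```

In a quasi-experimental design no such step is available to the data collector, which is why the checklist calls for statistical control of other factors instead.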
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2.3. Standard for Developing a Sampling Plan 



PURPOSE: To ensure that the data providers represent the population(s) of interest with a level of
accuracy that allows for answering the study questions.



GUIDELINES 



2.3.1. A determination should be made about whether it is necessary to obtain data from every member 
of the population of interest to meet the needs of the data requestor or whether sampling methods
can be employed. (See 2.3.1. Checklist for Choosing Between Universe and Sample Surveys.) 

2.3.2. The development of a sampling plan for any data collection should be considered an integral part 
of the design, not a separate stage undertaken after the design has been developed. 

2.3.3. The design should include a written, detailed sampling plan to ensure that the characteristics of 
the sample relative to the population under study are known and that the magnitude of sampling 
error in the estimates of population parameters can be assessed. (See 2.3.3. Checklist for 
Identifying Sampling Plan Contents.) 

2.3.4. The availability and quality of lists that cover and count the population of interest (population or 
universe frames) and the availability of budgeted funds are important factors that should be 
considered in developing the sample plan. 

2.3.5. Estimates of sampling error for the key statistics based on known or estimated distributions should 
be calculated during the sample design. 

2.3.6. Special designs should be considered to ensure that there is an adequate number of respondents 
to represent small subpopulations of interest (e.g., racial or ethnic groups, groups of schools
defined by size or some other quality).

2.3.7. If the study has multiple questions that would require different sampling designs for getting the 
best estimates, a design should be developed that represents the best compromise within the range 
of tolerable sampling errors, costs, burden, and other considerations.

2.3.8. If a panel design is chosen, procedures to maintain contact with subjects should be designed. To
ensure the usefulness of the data, sample attrition rates should be included in all estimates of
sample size.
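
As a rough illustration of guidelines 2.3.5 and 2.3.8, the following Python sketch estimates the sampling error of a proportion under simple random sampling and inflates a required sample size for expected panel attrition. The function names, the 95 percent confidence multiplier of 1.96, and the example figures are assumptions chosen for illustration and are not part of the standards. Stratified or clustered designs would require design-specific formulas.

```python
import math


def margin_of_error(p, n, population=None, z=1.96):
    """Approximate margin of error for a proportion p from a simple random sample of size n.

    The finite population correction is applied when the population size is given.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population is not None:
        se *= math.sqrt((population - n) / (population - 1))
    return z * se


def required_sample_size(p, margin, population=None, z=1.96):
    """Smallest simple random sample giving the requested margin of error for a proportion."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        # Adjust downward when the sample is a sizable share of the population.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


def inflate_for_attrition(n_final, retention_per_wave, waves):
    """Initial sample needed so that about n_final subjects remain after the given number of waves."""
    return math.ceil(n_final / retention_per_wave ** waves)


# Assumed example: 2,000 schools, a key proportion near 0.5, and a target margin of 0.05.
n_needed = required_sample_size(0.5, 0.05, population=2000)
# Assumed panel study: 90 percent of subjects retained at each of three follow-up waves.
n_initial = inflate_for_attrition(n_needed, 0.9, 3)
```

Under these assumed inputs the sketch yields 323 schools for a single collection and an initial panel of 444, which shows how quickly attrition can raise the sample size that must be budgeted.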
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2.3. Standard for Developing a Sampling Plan

2.3.1. Checklist for Choosing Between Universe and Sample Surveys



1. If data must be reported for each member of the target population to answer the study questions, 
a sample will not be useful. For example, if a measurement must be reported to the public for 
each school district, data must be collected from every district. However, district data may 
sometimes be based upon a representative sample of its schools. 

2. Determine whether the population whose characteristics are to be reported can be described by 
measurements of a sample. Determine if the precision provided by a sample survey will meet the 
information requirements. 

3. Develop estimates of the aggregate respondent burden associated with universe versus sample 
surveys. 

4. Consider the special burden that recurring surveys may impose on data providers who are already
required to report administrative data in different ways to other sources. 

5. Develop estimates of the costs of administering and analyzing alternative universe and sample 
surveys. The characteristics of the sample employed in these estimates are dictated by the 
precision required. 

6. Determine whether resources will be freed by using a sample rather than a universe survey and 
consider devoting some of these resources to reducing nonsampling error by using techniques such 
as improved measurement, more intensive interviewing and interviewer training, nonresponse 
followup, and alternative measurement techniques. Expected reductions in inaccuracy or errors 
resulting from the use of these procedures should be taken into account when choosing between 
universe and sample surveys.
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2.3.3. Checklist for Identifying Sampling Plan Contents

1. Define the target population as completely as possible.

2. Describe the sampling frame with regard to the source, reference date, number of units, and
quality of the frame (e.g., coverage of population of interest, completeness).

3. Describe the nature of the proposed sampling: probability versus nonprobability. (If
nonprobability sampling is used, explain the implication for the accuracy of population estimates.)

4. Specify and justify the sampling technique (e.g., simple random, systematic, matrix, stratification,
multistage, cluster, probability proportional to size).

5. Describe any stratification and clustering procedures.

6. Specify the units selected or planned for the sample at each stage.

7. Justify the proposed sample size and describe the effects of sample size on the precision of the
population estimates required to satisfy the information needs.

8. Describe the procedures for allocating the sample size at each stage.

9. Specify the number of sampling units selected at each stage.

10. Describe any measures of size that are stated for sampling with "probability proportionate to size"
and the types of estimates that are likely to be more precise as a result.

11. Develop procedures for dealing with nonresponse and describe the likely impact of nonresponse
on population estimates.

12. Specify the nature of sample weights to be applied, if any.

13. Give estimates of expected sampling errors based on known or estimated distributions.

14. Specify clear, detailed operational procedures for drawing the sample.
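
Several of the checklist items above (the sampling technique, the allocation of the sample, the sample weights, and the operational procedure for drawing the sample) can be made concrete with a short sketch. The Python example below uses proportional allocation in a stratified simple random sample. This is only one of the techniques the checklist names. The stratum names, frame contents, and function names are hypothetical.

```python
import random


def proportional_allocation(stratum_sizes, total_sample):
    """Allocate the sample across strata in proportion to stratum size.

    Fractional allocations are rounded down, and any remaining units go to
    the strata with the largest remainders so the total is exact.
    """
    population = sum(stratum_sizes.values())
    exact = {s: total_sample * n / population for s, n in stratum_sizes.items()}
    alloc = {s: int(x) for s, x in exact.items()}
    shortfall = total_sample - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:shortfall]:
        alloc[s] += 1
    return alloc


def draw_stratified_sample(frame, allocation, seed=0):
    """Draw a simple random sample within each stratum of the frame.

    Returns (unit, stratum, weight) tuples, where the weight is the number
    of frame units that each sampled unit represents.
    """
    rng = random.Random(seed)  # recording the seed documents how the sample was drawn
    sample = []
    for stratum, units in frame.items():
        n = allocation[stratum]
        if n == 0:
            continue  # a stratum with no allocation contributes no units
        weight = len(units) / n
        for unit in rng.sample(units, n):
            sample.append((unit, stratum, weight))
    return sample


# Hypothetical frame of 100 schools in three strata.
frame = {
    "urban": [f"U{i}" for i in range(60)],
    "suburban": [f"S{i}" for i in range(30)],
    "rural": [f"R{i}" for i in range(10)],
}
allocation = proportional_allocation({s: len(u) for s, u in frame.items()}, 20)
sample = draw_stratified_sample(frame, allocation, seed=1)
```

With proportional allocation every sampled unit carries the same weight. A design that oversamples small subpopulations, as guideline 2.3.6 suggests, would produce unequal weights that must be applied during analysis.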
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GUIDELINES 

2.4.1. The value and usefulness of the data collection activity that is actually feasible should be 
reassessed and weighed against the resulting response burden and financial costs. 

2.4.2. Tradeoffs between utility and data accuracy should be evaluated in view of the uses to which the
data will be applied. 

2.4.3. In planning the data collection activity, sampling and nonsampling error for the key statistics 
should be estimated to determine whether the data collection activity that is feasible is worth 
undertaking. 

2.4.4. Regardless of whether action will be taken or policy formulated on the basis of the data to be 
collected, consideration should be given to other benefits of having empirical answers to the study 
questions (e.g., credibility, impartiality, efficiency, chances of success, informed decisionmaking). 

2.4.5. As the design of the data collection proceeds, the value of obtainable data should be assessed 
regularly and alternative approaches considered. 



RELATED STANDARDS 

1.2. Standard for Justifying Data Collection Activities 



2.5. Standard for Transforming Study Question Concepts into Measures

PURPOSE: To ensure that measures of the concepts of interest are reliable and valid.



GUIDELINES 

2.5.1. An operational definition should be developed and should describe in a concrete and specific 
manner precisely how to obtain a measurement of a concept so that anyone could repeat the steps
and obtain the same measurements. 

2.5.2. An operational definition should be consistent with standard data element definitions where 
appropriate. 

2.5.3. Where possible, more than one measure of a key concept should be developed and used. 

2.5.4. A measure should be tested to ensure that it is reliable; that is, it produces consistent results and 
does not fluctuate in an uncontrolled way from one trial to the next. (See 2.5.4. Checklist for 
Determining the Reliability and Validity of Measurements.) 

2.5.5. A measure should be tested to ensure that it is valid for the intended purpose; that is, it measures 
what it is intended to measure. (See 2.5.4. Checklist for Determining the Reliability and 
Validity of Measurements.) 

2.5.6. A measure should assess a single dimension. 

2.5.7. The levels of measurement should be related to the analytical plan. Measures can be nominal 
(response categories are qualitatively different), ordinal (response categories can be ranked on a 
continuum), interval (response categories represent equal quantities of the variable measured), or
ratio (response categories represent quantities that can be multiplied). 

2.5.8. A scale or index should be developed when the variable is best measured by a set of categories 
or a range of scores. 

2.5.9. All measurement tools and procedures should be pretested under the same conditions that will be 
present during the main data collection. 

2.5.10. For continuing, long-term data collection activities, pretests should incorporate evaluation studies 
to demonstrate whether key items are reliable, valid, and unbiased measures of the concepts of 
interest. 
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1. Use more than one of the following tests to determine a measure's reliability. Responses will be 
the same (positively correlated) for a reliable item. 

■ Administer the same item twice to the same persons or groups and compute the correlation 
between the two scores. 

■ Correlate the results obtained from two halves of the same set of items.

■ Correlate each item with the total score and then average those correlation coefficients. 

■ Correlate each item with every other item and then average those coefficients. 

2. Use more than one of the following tests to determine a measure's validity. 

■ When appropriate, review literature that evaluates the validity of existing instruments. 

■ Assess each item to determine whether experts, data providers, and users believe that the item 
measures what it is intended to measure. 

■ Assess each measure to determine whether the item can distinguish known differences (i.e., 
administer the measure or instrument to individuals who are known to differ on the 
characteristic and compare the results). 

■ Assess each measure to determine whether the item produces results that are similar to the 
results produced by other measures attempting to measure the same construct. (For example, 
do students taking a new achievement test have scores similar to their scores on other 
achievement tests that have been validated?) 

■ Where appropriate, assess each measure to determine whether it has the ability to identify 
future behavior or change. 
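
The four reliability tests listed in item 1 of the checklist above all rest on correlation coefficients. The Python sketch below shows how each might be computed from item-level scores. The function names and the example scores are hypothetical, and the Pearson coefficient is assumed as the measure of correlation because the checklist does not name a specific coefficient.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores.

    Both lists must vary; a constant list leaves the coefficient undefined.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def retest_correlation(first, second):
    """Correlate scores from two administrations of the same item to the same persons."""
    return pearson(first, second)


def split_half(responses):
    """Correlate totals on the odd-numbered items with totals on the even-numbered items."""
    odd = [sum(person[0::2]) for person in responses]
    even = [sum(person[1::2]) for person in responses]
    return pearson(odd, even)


def mean_item_total(responses):
    """Average correlation of each item with the total score (the total includes the item)."""
    totals = [sum(person) for person in responses]
    k = len(responses[0])
    return sum(pearson([p[i] for p in responses], totals) for i in range(k)) / k


def mean_inter_item(responses):
    """Average correlation of each item with every other item."""
    k = len(responses[0])
    columns = [[p[i] for p in responses] for i in range(k)]
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)


# Hypothetical scores: five persons (rows) answering four items (columns).
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 3],
    [3, 4, 3, 4],
    [4, 4, 5, 5],
    [5, 5, 4, 5],
]
half_reliability = split_half(responses)
```

Coefficients near 1 suggest consistent measurement. What counts as acceptable depends on the uses to which the data will be put.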
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2.6. Standard for Designing the Data Collection Instrument

PURPOSE: To ensure that the data collection instrument promotes effective communication between
the data providers and the data collectors in order to obtain accurate information.

GUIDELINES 

2.6.1. The data collection instrument should reflect the study questions and objectives. 

2.6.2. Throughout the development of the data collection instrument, attention should be given to 
meeting the requirements of the analysis plan while minimizing burden. 

2.6.3. The data collection instrument should be refined and evaluated throughout the design and pretest 
phase. 

2.6.4. The data collection instrument should provide reliable and valid information on the specific 
questions to be answered or objectives to be achieved. Items should have the same meaning to 
all data providers, should fall within the language and knowledge capabilities of all data providers, 
and should promote accurate responses. (See 2.6.4. Checklist for Designing the Data Collection
Instrument.) 

2.6.5. The format of the instrument should minimize errors (e.g., coding, keying, skip-pattern). For 
example, precoding information on the instrument itself should be considered when practical and 
where it does not compromise the accuracy of results, unduly increase burden, or reduce data 
provider motivation to supply the information. 

2.6.6. An outside review of the instruments should be conducted. 

2.6.7. The language and format of the data collection instrument should be appropriate for the targeted 
data providers. 

2.6.8. Consideration should be given to a variety of available media (e.g., in-person, mail) during the 
design of the data collection instrument. 



RELATED STANDARDS 

2.5. Standard for Transforming Study Question Concepts into Measures 

2.8. Standard for Reviewing and Pretesting Data Collection Instruments, Forms, and Procedures 
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2.6. Standard for Designing the Data Collection Instrument

2.6.4. Checklist for Designing the Data Collection Instrument

1. Provide clear and sufficient instructions for completing the data collection instrument. Provide
detailed instructions for individual items when necessary.

2. Make definitions of data elements consistent with standard definitions for those data elements,
when possible.

3. Provide definitions for any words in the data collection instrument whose meaning may be
ambiguous.

4. Examine each item in the data collection instrument to make sure that the information is needed
for the purpose of the data collection.

5. Make sure that the purpose of each item on the instrument is understandable to the data provider.
Explain to data providers why questions are included that have no apparent connection to the topic
of the data collection. (For example, background questions might be asked in order to identify
connections between people's backgrounds and their views on teacher competency testing.)

6. Ensure that the requested information can be provided by the data providers.

7. Minimize the amount of time data providers will need to complete the data collection form.

8. Wherever possible, use units of measurement that are familiar to the data providers.

9. Use standard language and avoid jargon and abbreviations. Make sure that the technical terms
used are familiar to the data providers. Review questions for clarity. Keep questions short
and simple.

10. Design the item sequence of the data collection instrument to increase the data provider's ability
to complete the data collection. Keep topic-related questions together and provide transitions
between topics. Ensure that the item sequence does not influence responses to later questions.

11. Determine the item format that will provide the type of measurement necessary to answer the
study question (e.g., multiple choice, check list, rankings, ratings, open-ended).
12. Make sure that the items on the data collection instrument place the least possible burden on
data providers. Find out how data providers usually keep or process the information [illegible]

13. [illegible] affirmatively to the first question should be asked the question of interest.)

14. When a possible response to an item may be considered somewhat socially undesirable, try to
phrase the question and/or response categories in a way that makes [illegible]
comfortable about giving that reply. [illegible]

15. [illegible]

16. [illegible] phrases that may bias the responses are avoided [illegible]

17. [illegible]

18. When possible, conduct cognitive research (e.g., cognitive labs and/or [illegible])

19. [illegible] component of the pretest, to help ensure that respondents comprehend [illegible], have
access to the information requested, and are motivated to participate [illegible]
GUIDELINES

2.7.1. Attention should be paid not only to identifying potential sources of error but also to reducing 
total error (sampling and nonsampling errors) in the design phase. (See 2.7.1. Checklist for 
Designing to Reduce Nonsampling Error.) 

2.7.2. Potential sources of constant error (bias) and variable error (variability) should be identified in the 
design phase of the data collection activities. (See 2.7.2. Checklist for Identifying Types and 
Sources of Study Error.) 

2.7.3. Attention should be paid to identifying, measuring, and reducing nonobservation and observation 
errors. 

2.7.4. Among the nonobservational sources of error to which attention should be paid are errors of 
coverage, unit and item nonresponse, and sampling. 

2.7.5. Among the observational sources of error to which attention should be paid are interviewer, 
instrument, data provider, modes of data collection, and data processing. 

2.7.6. Consideration should be given to building evaluation studies of the methodology and data 
elements into the data collection design. For example, ask a portion of the items twice to a 
subsample of respondents (reliability); check the correctness of responses by obtaining verification 
from an independent source (validity); study the characteristics of nonrespondents to determine 
if they differ from those of respondents; and experiment with changes in instrument wording or 
item order. 

2.7.7. Quality control procedures should be developed and budgeted in the design phase. 

2.7.8. Nonresponse followup and item imputation procedures should be planned to reduce error resulting 
from unit (total) nonresponse and item (partial) nonresponse. (See 2.7.8. Checklist for 
Minimizing Unit and Item Nonresponse.) 
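
Guideline 2.7.6 suggests asking a portion of the items twice to a subsample of respondents. As a minimal sketch of how such reinterview data might be summarized, the Python below computes simple percent agreement and a chance-corrected agreement. The function names and the sample answers are hypothetical, and Cohen's kappa is one commonly used agreement statistic, not one prescribed by these standards.

```python
from collections import Counter


def percent_agreement(first, second):
    """Share of respondents who gave the same answer both times."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty answer lists of equal length")
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return matches / len(first)


def cohens_kappa(first, second):
    """Agreement corrected for the agreement expected by chance."""
    n = len(first)
    observed = percent_agreement(first, second)
    first_counts = Counter(first)
    second_counts = Counter(second)
    # Chance agreement: probability that both administrations yield the same
    # category if answers were independent with the observed marginals.
    expected = sum(
        (first_counts[c] / n) * (second_counts[c] / n)
        for c in set(first) | set(second)
    )
    if expected == 1.0:
        return 1.0  # every answer fell in a single category both times
    return (observed - expected) / (1 - expected)


# Hypothetical answers from ten respondents asked the same item twice.
original = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes"]
reinterview = ["yes", "yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes"]

print(f"percent agreement: {percent_agreement(original, reinterview):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(original, reinterview):.2f}")
```

Percent agreement alone can look high when one answer dominates, which is why a chance-corrected figure is shown alongside it.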
2.7. Standard for Minimizing Total Study Error (Sampling and Nonsampling)

2.7.1. Checklist for Designing to Reduce Nonsampling Error

1. Ensure that data providers fully understand the instructions and instrument items.

2. Choose data collectors who are credible to data providers when issues of credibility might result 
in biased results. 

3. Choose data collectors who are familiar with the population from whom data will be collected. 

4. Avoid sensitive or invasive questions where possible. When such items are necessary, careful 
wording and placement should be considered. 

5. Develop clear and concise instructions for data providers and data collectors to eliminate 
inconsistent responses caused by ambiguous instructions. Instructions should be pretested along 
with the data collection instruments. 

6. Use commonly accepted definitions and terminology so that all items, instructions, and other 
materials can be easily understood. 

7. Develop and implement training, monitoring, and refresher training for personnel to reduce 
inconsistencies in the implementation of procedures by interviewers, data entry personnel, coders, 
and other project staff. 

8. Design manual and machine edit checks with verification of values that are invalid or are out of 
expected range to detect and eliminate systematic mistakes in scoring, data entry, coding, 
tabulating, machine editing, and other processes. 

9. When possible, identify errors using outside resources including: 

■ Comparisons with data from other sources or previous collections (to detect implausible 
consistency or inconsistency) 

■ Analyses seeking implausible patterns associated with processing steps, particular field sites,
periods of collection, etc. 

■ Outside reviews to ensure that analyses and interpretations are examined critically 

■ Analyses of quality control practices 
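
Item 8 calls for machine edit checks that verify invalid or out-of-range values. A minimal sketch of such checks follows. The field names (`grade`, `enrollment`, `teachers`), the valid codes, and the limits are hypothetical and chosen only for illustration; a real collection would take them from the instrument's own specifications.

```python
# Hypothetical edit specifications; actual codes and limits would come from
# the data collection instrument, not from this sketch.
VALID_CODES = {"grade": {"K", "1", "2", "3", "4", "5", "6"}}
RANGES = {"enrollment": (0, 5000), "teachers": (0, 400)}


def edit_check(record):
    """Return a list of edit failures for one record (empty if clean)."""
    failures = []
    # Valid code checks.
    for field, codes in VALID_CODES.items():
        if record.get(field) not in codes:
            failures.append(f"{field}: invalid code {record.get(field)!r}")
    # Range checks.
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if not isinstance(value, (int, float)) or not low <= value <= high:
            failures.append(f"{field}: {value!r} outside {low}-{high}")
    # Consistency edit: a school reporting students should report teachers.
    if record.get("enrollment", 0) and not record.get("teachers", 0):
        failures.append("enrollment reported without any teachers")
    return failures


records = [
    {"id": 1, "grade": "3", "enrollment": 420, "teachers": 21},
    {"id": 2, "grade": "9", "enrollment": 52000, "teachers": 30},
    {"id": 3, "grade": "K", "enrollment": 85, "teachers": 0},
]
for record in records:
    for failure in edit_check(record):
        print(f"record {record['id']}: {failure}")
```

The function only reports failures. In keeping with item 8, flagged values would then be verified with the source rather than corrected automatically.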
2.7.2. Checklist for Identifying Types and Sources of Study Error

1. Check the study design for error using an error classification scheme.

2. One classification scheme identifies errors by the variability of error. To follow this scheme,
check the study design for constant (bias) and variable error.

■ Constant Error or Bias: These are errors that affect the statistics in all implementations of
the study design. Bias is a constant over all data collections using the same instrument.

■ Variable Errors: These are errors that vary over each data collection using the same design.
The phrase "level of precision" usually refers to the sum of the variable errors. High
precision means low variability.

   ■ Sampling variance: Denotes changes in the values of a statistic during possible
   replications of a survey where the sample design is fixed but different individuals are
   selected for different samples.

   ■ Response variance: Denotes variation in answers to the same question if repeatedly
   administered to the same person over different trials or replications (reliability).

3. Errors are also identified by their source, that is, as nonobservation or observation errors. These
types of error can be sources of both constant (bias) and variable error.

■ Nonobservation Errors: These are errors that arise because measurements were not taken
on part of the population.

   ■ Coverage errors: Errors that occur because some persons are not part of the list or
   sampling frame used to identify members of the population.

   ■ Nonresponse errors: Errors that occur because some persons on the sampling frame
   cannot be located or refuse to provide information. (See 2.7.8. Checklist for
   Minimizing Unit and Item Nonresponse.)

   ■ Sampling errors: Errors that occur because the statistic is computed for a sample of
   the population rather than for the entire population of interest.








■ Observation Errors: These are deviations in the answers of data providers from their true
values on the measure.

   ■ Interviewer errors are associated with effects on data providers that stem from
   differences among interviewers and the ways that they administer the data collection
   instrument.

   ■ Instrument errors are effects that stem from the wording or flow of the data collection
   instrument.

   ■ Data provider errors are effects that stem from differences in cognitive abilities,
   availability of information, and motivation to answer questions among data providers.

   ■ Mode of data collection errors are associated with the type of collection (e.g., answers
   provided over the telephone may be shorter than answers given in face-to-face
   interviews).

   ■ Data processing errors are errors associated with the coding and editing of data (e.g.,
   errors in coding open-ended questions or key punch errors).
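
The distinction between constant error (bias) and variable error can be made concrete with a toy population. The sketch below is only an illustration under stated assumptions: the six values, the sample size of two, and the frame that omits one unit (a coverage error) are all invented. It enumerates every sample the design could select, so the bias and sampling variance of the sample mean are exact rather than simulated, and it computes the mean squared error, which equals the sampling variance plus the squared bias.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population; the last unit is missing from the sampling frame.
population = [10, 12, 14, 16, 18, 30]
frame = population[:-1]
sample_size = 2

true_mean = mean(population)

# The sample mean from every sample the fixed design could select.
estimates = [mean(sample) for sample in combinations(frame, sample_size)]

expected_estimate = mean(estimates)

# Constant error: the same in every replication of this design.
bias = expected_estimate - true_mean

# Variable error: how much the estimate moves from sample to sample.
sampling_variance = mean((e - expected_estimate) ** 2 for e in estimates)

# Total error around the true population value.
mse = mean((e - true_mean) ** 2 for e in estimates)

print(f"bias:               {bias:.3f}")
print(f"sampling variance:  {sampling_variance:.3f}")
print(f"mean squared error: {mse:.3f}")
```

Drawing a larger sample from the same frame would shrink the sampling variance but leave the bias untouched, which is why the checklist treats the two kinds of error separately.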
2.7.8. Checklist for Minimizing Unit and Item Nonresponse

1. Make minimizing nonresponse a key consideration, beginning with the design phase. Use careful
planning to avoid nonresponse problems.

2. Assess the data availability and/or the data providers' expected willingness and interest to provide 
the data. 

3. Select the data provider group best able to give correct information with the least amount of 
burden. 

4. Estimate response burden for each item of the data collection instrument based on a pretest and, 
where possible, the response burden imposed by related collections.

5. Ensure that the response burden does not exceed a reasonable level and is justified by the intended 
use of the data. Avoid requesting information that may be available elsewhere, that is of marginal 
use to study purposes, or that imposes a heavy burden. 

6. Ensure that data collection forms are clear, use language and units that are familiar to the intended
data providers, and conform as much as possible to the way in which the data or information are 
kept or processed by the data provider. 

7. Whenever possible, determine and specify expected and acceptable response rates prior to 
conducting the data collection. Acceptable response rates vary with the type of data collected, 
how respondents differ from nonrespondents, and the degree of precision required. 

8. Before collecting data, determine a realistic level of effort, budget, and time needed to achieve the 
necessary responses. 

9. Provide for adequate nonresponse followup in the budget and time schedule. Achieving a high 
response rate nearly always requires considerable followup with data providers. For example, 
national mail surveys typically require considerable telephone followup, often involving nearly 
half of all data providers.

10. Ensure that there are adequate directions for mail surveys or other self-administered data collection
instruments.

■ Make instructions clear, concise, and easy to read.

■ Provide uniform explanations and definitions so that data providers are able to understand
the data items and provide accurate responses. Make sure data providers have a clear idea
of inclusions and exclusions.

■ Provide instructions for the mechanical steps needed to respond or to provide an answer (e.g.,
"Mark an X in the appropriate box for each item."). Provide an example of the desired
response technique on the survey form or electronic medium (e.g., include an illustration of
a pencil marking the letter X).

■ Use attractive designs and incentives (e.g., a willingness to share results) to encourage
responses.

■ Provide adequate time for data providers to complete data collection forms.

■ Make sure that the due date for completed data collection forms and the address and
telephone number of the data collector are included in the directions. Provide a postage-paid
return envelope and a telephone number to accommodate data provider inquiries.

11. Make provisions to monitor unit response rates by key strata throughout the data collection.

12. To lower unit nonresponse, develop concrete plans for obtaining data that are not supplied initially
by data providers. During the design phase of a mail data collection, telephone followups should
be included in the budget. The followups should be carried out when needed during the data
collection phase. Keep a log of all telephone and mail followup efforts, and record reasons for
nonresponse.

13. Identify key items for nonresponse followup in the survey design. Provide for field or office
response edits and telephone followup of key item nonresponse.

14. Prepare a training manual for data collectors. Ensure that interviewers selected for nonresponse
followup are well trained and capable of effectively communicating the importance of the data
collection activities. Monitor interviews.

15. Develop plans to identify the characteristics of nonrespondents for use in weighting and
imputation.

16. Plan to report response rates along with the data. For data to be weighted to national estimates,
response rates should be calculated on weighted data.
17. Plan to report item response rates. Items with unacceptable levels of nonresponse ordinarily
[illegible] to the findings. If such items are reported, point out that they are based
on low response rates and provide appropriate cautions.

18. If aggregated totals are to be reported and there are missing data, make provisions to impute the
data from appropriate sources such as previous years' data or through the use of techniques
such as nearest neighbor imputation.

19. Make plans to flag data that have been imputed on the data file and to describe the imputation
process. (See 5.3.8. Checklist for Imputing for Item Nonresponse.)
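
The later items of this checklist (response rates calculated on weighted data, nearest neighbor imputation, and flagging imputed values) can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the record layout, the field names (`weight`, `responded`, `enrollment`, `staff`), and the use of a single matching variable to decide which responding record is "nearest". The first function computes a unit response rate with or without sampling weights; the second fills a missing item from the closest responding record and flags the value as imputed so that the flag can be carried on the data file.

```python
def response_rate(units, weighted=True):
    """Unit response rate, optionally weighting units by sampling weight."""
    if not units:
        raise ValueError("no sampled units")
    total = sum(u["weight"] if weighted else 1 for u in units)
    responded = sum(
        (u["weight"] if weighted else 1) for u in units if u["responded"]
    )
    return responded / total


def impute_nearest_neighbor(records, item, match_on):
    """Fill missing `item` values from the donor closest on `match_on`.

    Returns new records; each carries an `<item>_imputed` flag.
    """
    donors = [r for r in records if r[item] is not None]
    if not donors:
        raise ValueError("no responding records available as donors")
    result = []
    for r in records:
        filled = dict(r)  # copy so the original record is left unchanged
        filled[f"{item}_imputed"] = r[item] is None
        if r[item] is None:
            donor = min(donors, key=lambda d: abs(d[match_on] - r[match_on]))
            filled[item] = donor[item]
        result.append(filled)
    return result


# Hypothetical sample of schools with sampling weights.
units = [
    {"weight": 10, "responded": True},
    {"weight": 10, "responded": True},
    {"weight": 40, "responded": False},
    {"weight": 40, "responded": True},
]
print(f"unweighted response rate: {response_rate(units, weighted=False):.2f}")
print(f"weighted response rate:   {response_rate(units):.2f}")

# Hypothetical responding schools; one did not report its staff count.
schools = [
    {"enrollment": 200, "staff": 15},
    {"enrollment": 480, "staff": 33},
    {"enrollment": 510, "staff": None},
]
for school in impute_nearest_neighbor(schools, "staff", "enrollment"):
    print(school)
```

In this invented sample the weighted and unweighted rates differ because the nonresponding unit carries a large weight, which is the situation item 16 is meant to expose.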
2.8.1. A review of data collection forms, tests, interview schedules, and/or other data collection
[illegible]

[illegible] design. All instruments, instructions, and procedures should be pretested. Particular
[illegible]

2.8.4. [illegible]

2.8.5. [illegible]

2.8.6. [illegible]

2.8.9. Pretest results should provide information on the following:

■ Data collection instruments, including the reliability and validity of measures and instructions 

■ Data collection procedures 

■ Organization/institution liaison and scheduling procedures 

■ Interviewer/data collector training and quality control 

■ Data provider burden 

■ Nonresponse and adjustment procedures 

■ Receipt control and data processing 

■ Aspects of the proposed data analysis and reporting 

RELATED STANDARDS 

3.1. Standard for Preparing for Data Collection 
GUIDELINES

2.9.1. A written design should be prepared during the initial planning of a data collection. (See 2.9.1. 
Checklist for Preparing a Written Design.) 

2.9.2. The written design should be reviewed at early stages by the data requestor and other
decisionmakers to ensure that they approve of the instruments, procedures, resource requirements, 
budget, and timelines. 

2.9.3. The design should include a clear and specific description of the methodologies selected to address 
the study questions and the rationale for using the proposed methodologies. 

2.9.4. The sample design, if applicable, should be clearly described. (See 2.3.3. Checklist for 
Identifying Sampling Plan Contents.) 

2.9.5. The design should include a clear description of data elements, data collection instruments, and
data collection procedures (e.g., administrative records, mail surveys, interviews, or observations). 
This description should include definitions of data elements or items. The design should also 
describe procedures for contacting and communicating with data providers and for selecting and 
training data collectors. 

2.9.6. The design should specify clearly how the instruments and procedures will be pretested and used
in a data collection. The design should also specify the criteria for evaluation, as well as 
mechanisms for revising or improving the plan. 

2.9.7. The design should specify procedures for data collection, processing, analysis, and reporting. The
design should reflect input from data collectors, processors, analysts, and reporters, and should
inform them of their responsibilities.

2.9.8. The data collection plan should provide for identification of potential sources of and reduction of 
total study error (sampling and nonsampling). 

2.9.9. A time schedule should be specified for major events, as well as procedures for corrective action 
if major milestones are missed. 

2.9.10. In all cases, the design should provide for evaluation of the data collection activities to determine
the extent to which major goals and milestones have been achieved and to identify areas where 
changes are needed. 
2.9.11. The design should be kept current and should document changes and modifications in the plan.



RELATED STANDARDS 

2.7. Standard for Minimizing Total Study Error (Sampling and Nonsampling) 

3.1. Standard for Preparing for Data Collection 

4.1. Standard for Planning Systems Requirements 

4.5. Standard for Planning for Data Preparation 

5.1. Standard for Preparing Analysis Plan 

5.4. Standard for Estimating Sampling and Nonsampling Errors 
2.9. Standard for Preparing a Written Design

2.9.1. Checklist for Preparing a Written Design

The written design should include the following: 

1. Identification of policy considerations, program goals, and problem areas that the data collection 
activities are intended to address 

2. Summary of relevant literature and related issues 

3. Identification of primary target audiences 

4. Articulation of basic assumptions 

5. Statement of the study questions 

6. Description and justification of methodologies or design 

7. Definitions of relevant concepts; descriptions of how terms are operationalized and measurement 
scales selected 

8. A detailed sampling plan that includes a description of the sampling frame, design, and procedures 
(See 2.3.3. Checklist for Identifying Sampling Plan Contents.)

9. Description of procedures for collecting data and for developing or selecting data collection 
instruments; citation of existing information sources 

10. Description of the evaluation or pretest and its design 

11. Description of the data processing plan 

12. Description of planned analyses and reports 

13. Time schedule 

14. Budget 
PHASE 3. DATA COLLECTION



3.0. Introduction to Data Collection 

3.1. Preparing for Data Collection 

3.2. Selecting and Training Data Collection Staff 

3.3. Ethical Treatment of Data Providers 

3.4. Minimizing Burden and Nonresponse 

3.5. Implementing Data Collection Quality Control Procedures 

3.6. Documenting Data Collections 
3.0. Introduction to Data Collection



The collection of education-related statistics frequently depends upon the efforts of individuals for 
whom assembling data is a minor portion of their responsibilities. For the collection phase to be a 
success, data collectors must enlist the cooperation of these data providers, establish a communications 
channel, conform to local and state collection regulations and protocols, and minimize the burden and 
intrusiveness of the data collection activities. Data collectors must effectively handle myriad management 
activities, communicate with data providers at different locations and organizations, schedule data 
collection activities that may involve hundreds of people, and provide for the manual or electronic transfer
of data from numerous collection sites across the country. 

The structure of the American education system creates unique data collection considerations. For 
example, the federal government collects data from state education agencies which, in turn, rely on local 
school systems to gather data from students and teachers within schools. Thus, the many operational and 
authority levels through which data must pass, along with the need to recognize legal and ethical 
obligations, add to the complexity of data collection and the need for collection standards. 

The standards contained in this phase encompass key principles associated with the data collection 
process. The standards are intended to address all types of data collections; however, individual standards
may apply to some types of activities more than others. 

Words in italics are defined in the glossary. 
GUIDELINES

3.1.1. Data collectors should participate in the development of a data collection plan and schedule during
the design phase. The data collection plan should include:

■ Rationale for data collection 

■ All data collection instruments and the instructions for completing them 

■ A list or description of the target population, including information for locating each selected 
data supplier or unit (e.g., mailing address, telephone number, identification number) 

■ A description of the sampling design, when appropriate 

3.1.2. The plan should also describe all administrative steps used in each data collection activity, 
including: 

■ Distributing the data collection forms and instructions 

■ Recording data on the forms or other media (e.g., floppy disks, magnetic tape) 

■ Returning the forms 

■ Keeping a record of each time a form is returned 

3.1.3. Data collectors should consider alternative approaches to collecting data (e.g., technologies, media, 
procedures). 

3.1.4. Pretest results should be reviewed to determine if the data collection plan needs modification. 

3.1.5. Procedures should be developed to monitor data collection activities. 

3.1.6. Data collectors should contact appropriate individuals before initiating data collection to: 

■ Explain the purposes of the data collection activity 

■ Obtain appropriate authorization and clearances 

■ Review any relevant laws, regulations, or administrative procedures that may affect the data 
collection activity 

■ Agree on data collection dates and milestones 

■ Establish a means of communication (e.g., contact person) 

■ Identify the most appropriate and knowledgeable data providers 
3.1.7. Materials for training data collection staff should be developed.

3.1.8. An adequate supply of data collection instruments, instructions, and administrative guides, 
including alternative forms for special populations, should be made available at each site before 
beginning the data collection. 

3.1.9. Procedures should be implemented to ensure that data collection forms (or electronic formats) are 
received by the intended data providers. These procedures may include calling a small sample 
of data providers to verify receipt, providing a postcard to be returned by data providers when 
they receive the forms, or using electronic mail. 

3.1.10. Data collectors should develop and implement recordkeeping plans, including maintaining a log 
noting return of forms or electronic media. 

3.1.11. Data collectors should develop and implement security plans for handling and storing confidential 
or sensitive documents or electronic media. 

3.1.12. Arrangements for delivering data should be coordinated with data processing staff
and analysts.
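
Guidelines 3.1.2 and 3.1.10 call for a record of each returned form. One way such a log might be kept is sketched below. The class name, the site identifiers, and the dates are hypothetical, and a real project would likely keep the log in its own tracking system rather than in memory.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ReceiptLog:
    """Tracks which data providers were sent forms and when forms came back."""

    due_date: date
    sent: set = field(default_factory=set)
    returned: dict = field(default_factory=dict)  # provider id -> return date

    def record_sent(self, provider_id):
        self.sent.add(provider_id)

    def record_return(self, provider_id, when):
        if provider_id not in self.sent:
            raise KeyError(f"no form was sent to {provider_id!r}")
        # Keep the first return date; later resubmissions do not overwrite it.
        self.returned.setdefault(provider_id, when)

    def outstanding(self):
        """Providers whose forms have not been logged as returned."""
        return sorted(self.sent - set(self.returned))

    def needs_followup(self, today):
        """Outstanding providers once the due date has passed."""
        return self.outstanding() if today > self.due_date else []


log = ReceiptLog(due_date=date(1992, 3, 1))
for site in ("district-01", "district-02", "district-03"):
    log.record_sent(site)
log.record_return("district-02", date(1992, 2, 20))

print(log.outstanding())
print(log.needs_followup(date(1992, 3, 5)))
```

A log of this kind also supplies the list of providers to contact during nonresponse followup.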



RELATED STANDARDS 

2.2. Standard for Choosing the Data Collection Methods 

2.5. Standard for Transforming Study Question Concepts into Measures 

2.6. Standard for Designing the Data Collection Instrument 

2.7. Standard for Minimizing Total Study Error (Sampling and Nonsampling) 

2.8. Standard for Reviewing and Pretesting Data Collection Instruments, Forms, and Procedures

3.2. Standard for Selecting and Training Data Collectors



3.2. Standard for Selecting and Training Data Collectors

PURPOSE: To ensure that the persons responsible for collecting and reporting data have the
knowledge and skills to conduct the planned activities.



GUIDELINES 

3.2.1. Data requestors should identify any special knowledge, experience, certification, or training 
required for data collectors. 

3.2.2. Selection and training should reflect the complexity of the project. For complex data collections, 
training should include, at a minimum, a thorough explanation of the goals of the data collection 
activities, instructions for deviating from or expanding on items, and methods for documenting 
the collection activity, including any problems encountered. (See 3.2.2. Checklist for Training 
of Data Collectors.)

3.2.3. Training should be designed based on the collection methodology to be used. For example, more 
extensive training is usually required for instruments with open-ended items. 

3.2.4. Personnel selected and trained to evaluate programs should not be part of the staff or management 
of the program being evaluated. 

3.2.5. A contact person should be available to answer questions from data collectors before and during 
the data collection. 

3.2.6. Data collectors should be sensitive to data provider characteristics that may influence the data 
collection. Training of data collectors should address cultural, ethnic, and other population 
characteristics that may affect the data collection. 

3.2.7. Staffing resources should be sufficient to ensure that replacement personnel are available on an 
as-needed basis. 

3.2.8. Data collectors should practice with the data collection instrument (administer the test, conduct 
the interviews, transcribe the record). When the instrument requires data collectors to make 
judgments or interpretations, staff training should ensure reliability of responses among data 
collectors. 

3.2.9. A review should be conducted regarding problems encountered by data collectors throughout the 
data collection phase, and provisions should be made for followup training. 

RELATED STANDARDS 



3.3. Standard for Ethical Treatment of Data Providers 

3.4. Standard for Minimizing Burden and Nonresponse 



3.2.2. Checklist for Training of Data Collectors

Training should be designed to ensure that data collectors understand:

1. All definitions used in the data collection instrument

2. The persons from whom or the records from which the data are to be collected (e.g., Which
teachers will be interviewed? What classes will take the test? What records will be examined?)

3. The date, time, and duration of the data collection activity 

4. The required procedures for administering the data collection activity 

5. Ethical and legal responsibilities to prevent unauthorized use or disclosure of data (If the data are
particularly sensitive, it may be appropriate to require data collectors to sign statements affirming 
that they recognize their responsibilities and agree to preserve confidentiality.) 

6. Procedures for obtaining explanations or help during the data collection 

7. Limits on acceptable deviation from specified procedures 

8. Methods for reporting data 

9. Methods for reporting occurrences during the data collection that may affect data quality or other 
aspects of the findings 

10. Appropriate ways of collecting data to avoid influencing responses 

11. The need for sensitivity to characteristics of data providers that may influence the data collection 
(e.g., culturally sensitive issues) 
PURPOSE: To ensure that data collection activities are carried out with respect for the [illegible]

GUIDELINES



3.3.1. Data collection activities should be carried out in a manner that minimizes embarrassment or 
inconvenience and avoids harm to data providers and other participants (e.g., students, teachers, 
education administrators). When personal questions must be asked, data collectors should arrange 
for data providers to respond in private. 

3.3.2. Data collection activities should be conducted in a manner that results in minimal disruption of
data provider activities. 

3.3.3. When a data collection activity provides a benefit for one group of data providers, consideration 
should be given to providing an equal benefit to any control group (e.g., students should not be 
intentionally deprived of educational benefits). 

3.3.4. Data providers or their legal guardians should be notified in advance whether participation is 
mandatory or voluntary. If participation is mandatory, data collectors should discuss the relevant 
statutory and regulatory citations with data providers and explain the meaning of these citations 
in clear language. If participation is voluntary, data collectors should assure data providers or 
their legal guardians that they will not be penalized for declining to participate. At the time of 
the data collection, data collectors should remind data providers of the purpose of the study and 
whether participation is voluntary. 

3.3.5. Data collection procedures should ensure that respondents have the opportunity to provide accurate 
answers regardless of their cultural or educational backgrounds. 

3.3.6. Appropriate provisions should be made for participants who are physically disabled or who require 
other special considerations. 

3.3.7. Data collectors should explain the extent to which answers will be kept confidential. Promises 
of confidentiality should not be made if statutory, regulatory, or administrative procedures require 
or permit disclosure. 
3.3.8. Data collectors should ensure that the confidentiality of data is protected:

■ Data collectors should not discuss confidential aspects of the data collection activity with 
unauthorized individuals. 

■ Copies of records, test scores, and other data should be kept in a secure place and delivered 
promptly to the appropriate location or person. 

■ Notes and other documentation kept during the data collection activity should not contain 
identifying information that is not expressly required by the research design. 

■ Data collection activities should be carried out in compliance with applicable federal, state, 
and local laws concerning privacy and confidentiality. 

■ Records should be destroyed upon completion of requirements for the data collection activity. 
PURPOSE: To ensure that data collections are minimally burdensome and that the data collected are
representative of the population being studied.



GUIDELINES 

3.4.1. Data collectors should secure the cooperation of appropriate individuals and organizational levels 
before beginning data collection and should follow accepted administrative protocols (e.g., states 
should be contacted before school districts; superintendents should be consulted before school 
principals). 

3.4.2. Data collectors should take steps to minimize the time, cost, and effort required of data providers 
(e.g., providing self-addressed stamped envelopes for return of forms). 

3.4.3. The data collection should be scheduled, to the extent possible, at the convenience of the data 
providers and with appropriate lead time. Established schedules should be adhered to strictly 
unless data providers request a justifiable change. 

3.4.4. Data collectors should contact data providers and the heads of their organizations before the 
collection to confirm that appropriate arrangements have been made for the data collection. 

3.4.5. Endorsements of the goals of the data collection activities by the data providers' peers and 
constituent organizations should be obtained, when appropriate, and made available to the data 
providers. 

3.4.6. Data collectors should give data providers the following information when requesting that they 
participate in a data collection activity: 

■ Purpose and need for data 

■ Estimated time and burden imposed 

■ Importance of data providers' participation 

■ Confidentiality of response 

3.4.7. Local flexibility in the data collection plan should be permitted whenever feasible. 

3.4.8. Trained personnel should be available to answer data providers' questions. Persons who want 
more information prior to responding to the data request should be able to reach trained personnel 
by telephone (the telephone number should be given to all data providers). 
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3.4. Standard for Minimizing Burden and Nonresponse



3.4.9. Data collectors should offer data providers an opportunity to request a summary of the study 
findings when feasible. 

3.4.10. Data collections should be monitored to ensure that data collection procedures are followed.

3.4.11. The design, budget, and schedule should provide for nonresponse followup (including identifying 
reasons for nonresponse). Those responsible for conducting followups should be trained in 
techniques for soliciting study participation. 

RELATED STANDARDS 

2.6. Standard for Designing the Data Collection Instrument 
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GUIDELINES 

3.5.1. Procedures should be implemented to review returned data collection forms for completeness and 
consistency. 

3.5.2. Specifications should be established for acceptable range checks, valid code checks, and logic or 
consistency edits. Related data should be reviewed to help determine appropriate edits. 

3.5.3. Data collection procedures should include verifying a subsample of data for accuracy and
completeness (e.g., recontact a selected number of data providers; review a subsample of 
administrative records). 

3.5.4. Followup procedures should be implemented to ensure that unit and item response rates meet 
design specifications. 
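
As an illustration only, the range checks, valid code checks, and logic or consistency edits named in 3.5.2 might be specified and applied as in the Python sketch below. The item names, ranges, codes, and the consistency rule are hypothetical examples and are not part of the standard.

```python
# Hypothetical edit specifications for two illustrative items; the item
# names, ranges, and codes are assumptions, not taken from the standard.
EDIT_SPECS = {
    "grade": {"type": "range", "min": 1, "max": 12},
    "sex": {"type": "codes", "valid": {"M", "F", "U"}},
}


def consistency_edits(record):
    """Logic edits that compare related items within one record."""
    problems = []
    # Example rule: students tested cannot exceed students enrolled.
    if record.get("tested", 0) > record.get("enrolled", 0):
        problems.append(("tested", "consistency: tested exceeds enrolled"))
    return problems


def edit_record(record):
    """Return a list of (item, reason) pairs for every failed edit."""
    problems = []
    for item, spec in EDIT_SPECS.items():
        value = record.get(item)
        if value is None:
            problems.append((item, "missing"))
        elif spec["type"] == "range" and not spec["min"] <= value <= spec["max"]:
            problems.append((item, "range check failed"))
        elif spec["type"] == "codes" and value not in spec["valid"]:
            problems.append((item, "invalid code"))
    return problems + consistency_edits(record)
```

A record that passes every edit yields an empty list. A record that fails yields one entry per failed edit, naming the item and the edit that was not met, which can feed directly into followup with data providers.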
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GUIDELINES 

3.6.1. Data collectors should record any problems in the data collection instruments, instructions, or 
administration identified during the pretest or data collection. 

3.6.2. Data collectors should record any deviations from the data collection plan and the reasons for the 
deviations. 

3.6.3. Data collectors should record instances of unit and item nonresponse and the reasons for their 
existence (e.g., the number of illegible or missing records).

3.6.4. Data collectors should record information concerning each contact with data providers and other 
appropriate individuals including the purpose of the call and the resolution or action taken. 

3.6.5. Field notes should be submitted and reviewed immediately after a data collection for 
completeness, accuracy, and potential bias. Appropriate feedback should be given to the data
collector. 

3.6.6. A summary of factors encountered during the data collection that may influence interpretation of 
the data should be included in the final documentation. 
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PHASE 4. DATA PREPARATION AND PROCESSING 



4.0. Introduction to Data Preparation and Processing 

4.1. Planning Systems Requirements 

4.2. Designing Data Processing Systems

4.3. Developing Data Processing Systems 

4.4. Testing Data Processing Systems 

4.5. Planning for Data Preparation 

4.6. Preparing Data for Processing and Analysis 

4.7. Maintaining Programs and Data Files 

4.8. Documenting Data Processing Activities 

4.9. Evaluating Data Processing Systems 
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4.0. Introduction to Data Preparation and Processing 



The standards in this phase cover the process of transforming raw data into a format that can be 
analyzed and the system by which this transformation is accomplished. The standards can be used to
guide an agency's in-house data preparation and processing activities or to provide specifications to an 
outside data processing contractor to help ensure that its procedures meet the agency's requirements. 
These standards can also provide staff in other phases of the data collection activity with an insight into 
data processing activities. 

Data processing systems may be manual or automated and may be operated on microcomputers, 
minicomputers, or mainframe computers. Although reference is often made in this phase to automated 
systems, these standards can be adapted for any data processing system. The standards are intended to 
guide data preparation and processing efforts for individual data collection activities. Although applicable 
to automated data processing as general principles, the standards are not intended to provide a complete
description of all systems development, programming, or hardware requirements necessary for the efficient
operation of a computing facility. The data processing standards of the American National Standards
Institute (ANSI) or the Federal Information Processing Standards (FIPS) should be consulted for technical
systems development.

The first four standards in this phase address the development of an information processing system 
in four stages: Systems Requirements, Design, Development, and Testing. The next two standards cover 
preparing data for processing, which means converting data appearing on data collection instruments into
useable computer data files. Attention should be given to data preparation activities during the early 
stages of systems development. The final three standards address system maintenance and operations and 
data management. Taken together, the standards in this phase serve as a management tool for reviewing 
data processing plans, costs, and progress. 

As noted in several of these standards, it is essential to establish and maintain effective 
communication between the data processors and other staff members during the entire period in which the 
data collection activities are in operation. Prior to establishing a budget, the data processing staff should 
discuss the project requirements with other staff members to determine if any additional data processing
personnel are needed. 

Because of their intertwined responsibilities, communication between the data processing and data 
collection staffs is particularly important during data preparation. Data collection forms move between
collection staff and processing staff as they are received, logged, manually coded and checked, sent for 
followup efforts, key entered, and machine edited. Indeed, collection staff at some agencies handle several 
of the data preparation activities discussed in the data processing standards. 

Words in italics are defined in the glossary. 
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GUIDELINES 

4.1.1. Data requestors, producers, and users should work as a team throughout the design phase to create
clear and unambiguous specifications that define the needs of the data requestors and provide
guidelines for data processing.

4.1.2. In determining the requirements for the data processing system, data requestors and data
processors should consider the acceptable level of accuracy, the complexity of the data analysis,
and the format and media for delivering the final data.

4.1.3. The first step in the development of an information system should be the Systems Requirements
Stage, which defines the scope of the system.

4.1.4. During the Systems Requirements Stage, the requestor of the automated system should provide
a written statement describing the problems or needs that the system will address and the expected
outcomes.

4.1.5. Data processing staff should meet with other project members to specify and define system
requirements, including: 

■ Proposed benefits 

■ Cost 

■ Timelines 

■ Resource requirements 

■ General constraints 

■ Alternative systems (including cost benefits) 

■ Measurable objectives to be obtained 

■ Recommendations for system development 



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4.1.6. Specifications to meet systems requirements should include, as appropriate, guidelines for the 
following: 

■ Methods for collecting the data 

■ Instruments to be used for data collection 

■ Time period during which the data should be collected 

■ Time period for data preparation activities 



■ Reporting deadlines

■ Imputation methodology 

■ General guidelines for data conversion and processing 

■ Magnetic storage specifications 

■ Preliminary layout for final data files 

■ Preferred formats for output reports 

4.1.7. A system description should be prepared of the tasks and timelines required for data input, 
processing, and output for each function and process in the automated system (using data flow 
diagrams and system flowcharts). 

4.1.8. Preliminary examples of reports or other data presentations (such as tables) should be developed 
to demonstrate the types of information the system is expected to produce and to ensure that the
proposed system will provide the desired results. 

4.1.9. Before leaving the Systems Requirements Stage, data processors and others involved in the data 
collection activity, including the systems requestors and the design and analysis staff, should agree
on all specifications. 



RELATED STANDARDS 



4.5. Standard for Planning for Data Preparation 
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4.2.1. The Systems Design Stage should build upon the system requirements specified in the Systems
Requirements Stage, proceeding from statements about what the system will do to descriptions of
how the system will accomplish these goals.

4.2.2. System specifications should include file descriptions that identify and describe data elements, 
record format, and file design.

4.2.3. A preliminary list and description of the data elements that will be used in the automated system 
should be included in a data dictionary. 

4.2.4. The systems design should specify control features that will help to ensure the accuracy and
completeness of each input, process, and output. These features may include audit trails, control 
totals, status flags, system performance statistics, audit and error reports, and system interrupt and 
restart procedures. 

4.2.5. The systems design should address data security and confidentiality issues. 

4.2.6. Design specifications should be written for all computer programs needed for the operation of the
system. Each program should be designed to perform a single function or several closely related
functions.

4.2.7. Specifications for a computer program should describe how the program will accomplish its stated
purposes, the processes the program will perform, the file and record formats to be used, and the 
logic to be employed. The specifications should be written so that they can be translated into 
computer program code. 

4.2.8. The data processing staff should consider the needs of those who will use the data when
determining the content, format, and media for presenting final data. The likelihood that there will 
be future requests for data or analysis should be taken into account in the data processing systems 
specifications. 

4.2.9. The criteria and method to be used for evaluating how well the data processing system performs 
should be determined. 

4.2.10. A plan should be developed for testing the system that specifies the functions to be tested, the data 
required to test the program functions, and the expected test results. 
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4.2. Standard for Designing Data Processing Systems



4.2.11. Staff from all phases of the data collection activity should understand and approve the design 
specifications before the Systems Development Stage begins. 

4.2.12. All tasks that are part of the Systems Design Stage should be fully documented to meet the needs 
of potential users as well as of programmers who may be required to modify the system for future 
use.
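
One of the control features listed in 4.2.4, control totals recorded in an audit trail, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the field name, the step names, and the structure of the audit trail entries are hypothetical and are not prescribed by the standard.

```python
def control_totals(records, amount_field):
    """Return (record count, sum of one numeric field) for a batch."""
    return len(records), sum(r[amount_field] for r in records)


def check_step(before, after, amount_field, audit_trail, step_name):
    """Compare control totals across a processing step and log the result.

    Appends one entry to the audit trail and returns True when the
    totals computed before and after the step agree.
    """
    expected = control_totals(before, amount_field)
    actual = control_totals(after, amount_field)
    status = "ok" if expected == actual else "mismatch"
    audit_trail.append({
        "step": step_name,
        "expected": expected,
        "actual": actual,
        "status": status,
    })
    return status == "ok"
```

A step that drops or duplicates a record changes the count or the sum, so the mismatch appears in the audit trail before the error can propagate to later processes.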



RELATED STANDARDS 

4.4. Standard for Testing Data Processing Systems 

4.7. Standard for Maintaining Programs and Data Files

4.8. Standard for Documenting Data Processing Activities 

4.9. Standard for Evaluating Data Processing Systems 
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GUIDELINES 

4.3.1. In the Systems Development Stage, systems design specifications should be converted into written 
computer programs. 

4.3.2. Programmers and data processors should adhere fully to the agreed-upon specifications and consult
with data requestors if there appears to be a need for any departure from the specifications. 

4.3.3. The data producer should determine whether ad hoc requests for data or analysis not specifically
envisioned in the systems design can be handled in a cost-effective manner. The data processing
staff should specify the cost and effort required to fulfill ad hoc requests.

4.3.5. All programs should be fully documented to meet the needs of system users and of programmers 
who may be required to modify the system. 

RELATED STANDARDS 

4.8. Standard for Documenting Data Processing Activities 
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4.4.1. Before any system is considered ready for operation, all programs should be tested during the
Systems Testing Stage (individually and as a complete system) in accordance with the system test 
plan developed during the Systems Design Stage. 

4.4.2. Final testing should be performed by the data requestor or producer. 

4.4.3. All program and system test results should be documented in a test report that specifies each
function tested, the test data used, and the expected and actual results. Any unexpected findings
should be explained and errors corrected, as appropriate. 

4.4.4. Computer program changes should be documented internally (within the programs). 

4.4.5. The appropriate members of the data producer's staff should review and approve test results before
concluding the Systems Testing Stage. 
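
The test report described in 4.4.3 records, for each function tested, the test data used and the expected and actual results. A minimal Python sketch of such a report follows. The function being tested (`percent_tested`) and the layout of the report rows are hypothetical and serve only to illustrate the idea.

```python
def run_tests(function_under_test, test_cases):
    """Run each test case and return one test-report row per case.

    Each case is a tuple: (function tested, test data, expected result).
    """
    report = []
    for name, data, expected in test_cases:
        actual = function_under_test(data)
        report.append({
            "function": name,
            "test_data": data,
            "expected": expected,
            "actual": actual,
            "passed": actual == expected,
        })
    return report


# Hypothetical program function: percent of enrolled students tested.
def percent_tested(counts):
    enrolled, tested = counts
    return round(100 * tested / enrolled, 1)
```

Rows in which the expected and actual results differ identify the unexpected findings that need to be explained or corrected before the Systems Testing Stage is concluded.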



RELATED STANDARDS 



4.2. Standard for Designing Data Processing Systems 
4.8. Standard for Documenting Data Processing Activities 
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GUIDELINES 

4.5.1. Planning for data preparation should begin during the early stages of a data collection activity. 
Specific factors to examine when developing the plan include: 

■ Type of data to be collected 

■ Type of edit checks needed 

■ Method for receipt control 

■ Computer system to be used for the study 

■ Timing and volume of data retrieval 

■ Sample size 

4.5.2. A detailed schedule should be developed of data preparation tasks, and copies should be provided 
to all appropriate staff including data processors. Schedule items include: 

■ Dates that training sessions will be held for manual coding and editing 

■ Date that edit specifications will be submitted to data processors 

■ Date that edit program will be tested and ready to run 

■ Dates that test data and actual data will be submitted to data entry 

■ Date that a clean file will be available 

4.5.3. The data processors should be consulted throughout the planning stages of a data collection 
activity and fully informed about the types of support they will be asked to provide during the 
activity. Data processors should receive a copy of the proposed project schedule. 

4.5.4. The data preparation and processing staffs should jointly determine the machine editing system 
to be used. A plan should be developed to familiarize data preparation staff with the different 
machine editing systems that are available. 
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4.5.5. Training materials for manual or on-line coding and editing (including a codebook and sample 
forms) and data entry instructions should be prepared. 

4.5.6. Range checks and logic or consistency edits should be established. An edit program should be 
created to detect out-of-range or invalid responses and key-entry errors. 

4.5.7. Procedures for reviewing and coding the data should be clear and complete. (See 4.6.2. Checklist
for Coding Data.) Coding staff should be trained to follow the procedures and to document 
problems. 

4.5.8. When imputation is requested, data processors should check with the data analysts to determine 
the appropriateness of imputing data for some instances of item nonresponse. The need for and 
type of imputation depends upon the type of collection and analysis activity and the type of data. 
(See 5.3.8. Checklist for Imputing for Item Nonresponse.)
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GUIDELINES 

4.6.1. When data are received (either on forms or electronically), receipt should be logged on computer
or in a log book so that all data forms or files can be tracked from their date of receipt through 
data followup and final key entry. (Electronic files should contain both transmittal and run dates, 
as well as other critical information identifying the data provider.) 

4.6.2. All forms should be reviewed for completeness. (See 4.6.2. Checklist for Coding Data.) Forms
with items that have not been completed or forms with out-of-range or invalid responses should
go to the data collection staff for followup. When there are out-of-range or invalid responses, the
data collection staff should receive an explanation of the reasons that the responses are 
unacceptable. 

4.6.3. Data forms from out-of-scope providers or forms that are still missing key items or a significant 
number of items after data followup should be evaluated for rejection in accordance with specified 
criteria. 

4.6.4. Whether data are entered into an automated system from data collection forms, electronically, or 
directly into a computer assisted telephone interviewing (CATI) system or a computer assisted 
personal interviewing (CAPI) system, data should be thoroughly checked for key-entry errors. 
Range checks and logic or consistency edits programmed into CATI or other automated systems 
can flag values that do not fall within the specified range of possible values; however, these 
checks and edits cannot detect values that are within range but still inaccurate. 

4.6.5. In order to verify that data are being entered correctly, a reasonably sized sample or all of the
forms should be keyed twice (by different staff). Differences between data files keyed from the 
same data collection forms should be resolved by referring to the original forms. 

4.6.6. When data are entered through machine-readable techniques (e.g., bar-coding, optical scanning), 
a sample set of data should be verified periodically to ensure accurate conversion. 

4.6.7. After the data are edited, an error report or electronic file should be produced listing data that
were flagged as invalid, with explanations for why they were flagged (i.e., which range check or
logic edit has not been met). This report or electronic file should be reviewed, the errors checked 
against the original data collection forms, and corrections entered. 
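
The double-keying verification described in 4.6.5 amounts to comparing two independently keyed files and listing every disagreement for resolution against the original forms. The Python sketch below shows one way this comparison might be done. The form identifiers, item names, and in-memory data structure are hypothetical.

```python
def compare_keyings(first_pass, second_pass):
    """Compare two independent keyings of the same forms.

    Each argument maps a form ID to a dict of {item name: keyed value}.
    Returns a sorted list of (form ID, item, first value, second value)
    for every disagreement, to be resolved by checking the original form.
    """
    differences = []
    for form_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(form_id, {})
        b = second_pass.get(form_id, {})
        for item in sorted(set(a) | set(b)):
            if a.get(item) != b.get(item):
                differences.append((form_id, item, a.get(item), b.get(item)))
    return differences
```

This comparison only shows that the two keyings disagree. It cannot show which one is correct, which is why the guideline calls for referring back to the original form.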
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1. Determine in advance all acceptable responses to each closed-ended item on the data collection 
form and, if appropriate, assign numerical values or codes to each response. 

2. Assign consistent numerical values to responses so that a given response always has the same 
value and no other response has that value. Assign values or codes for blanks or missing data. 

3. Maintain complete and detailed codebooks with the following information: 

■ Columns occupied by each variable in the data file 

■ Variable names 

■ Values assigned to each variable 

4. Make multiple copies of the codebook and keep them in secure places. 

5. Stamp the date on incoming data collection forms when they are received at the project site and 
log them in on computer or in a log book. 

6. Review data collection forms as they are received to confirm that all necessary codes have been 
created. Add additional codes as required in order to eliminate the need for data entry staff to 
make decisions while entering data. 

7. Keep a log of all coding decisions to ensure that coded data can be interpreted in the future. 

8. Note on the data collection form the value assigned to each response unless codes are preprinted 
on the data form. 

9. Identify the coder of each form (e.g., write initials in corner) and note in the log book where or 
to whom the form is being sent (e.g., to data collection for followup or to data entry). 

10. Perform systematic quality control on coders' work for accuracy and intercoder reliability.
Review a high percentage of work initially and move to a lower percentage as acceptable levels 
of accuracy are reached. Have a second person code a portion of each coder's forms. 
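
Items 1 and 2 of this checklist (predetermined codes, one code per response, and a reserved code for blanks or missing data) can be illustrated with a short Python sketch. The item name, the response categories, and the choice of 9 as the missing-data code are hypothetical examples only.

```python
MISSING = 9  # hypothetical code reserved for blank or missing responses

# Hypothetical codebook entry for one closed-ended item.
CODEBOOK = {
    "school_type": {"public": 1, "private": 2, "other": 3},
}


def validate_codebook(codebook):
    """Check that no two responses to an item share a code and that the
    missing-data code is not reused for a real response."""
    for item, codes in codebook.items():
        values = list(codes.values())
        if len(values) != len(set(values)):
            raise ValueError(f"duplicate code in item {item!r}")
        if MISSING in values:
            raise ValueError(f"missing-data code reused in item {item!r}")


def code_response(item, response):
    """Return the code for a response; blanks get the missing-data code."""
    if response is None or response.strip() == "":
        return MISSING
    return CODEBOOK[item][response.strip().lower()]
```

A response that does not appear in the codebook raises a `KeyError` in this sketch, which corresponds to the situation in item 6 where a new code must be created before data entry proceeds.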
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4.7.1. Procedures should be developed for ensuring that data are properly protected at all phases of data 
preparation and processing. This holds true whether the data are being processed manually or by 
an automated system. (See 4.7.1. Checklist for Backing Up Programs and Data Files.) 
Procedures for automated systems should include backing up at each step, beginning with the 
initial data collection process and continuing through the final analysis of data and publication of 
reports. 

4.7.2. Data preparation and processing staff should be trained in the policies and procedures for proper 
backup of data files. Staff awareness should be maintained by periodically scheduling simulated
tests of disasters.

4.7.3. Data processing staff should obtain information from other staff members about the requirements
for maintaining historical data files or programs for audit purposes. Data files or programs that
are used for allocating funds are often subject to specified retention periods. File archives should 
be considered for any systems that have a high volume of transaction processing. (See 4.7.3. 
Checklist for Retaining Programs and Data Files.) 

4.7.4. Data processing staff should meet with other project staff to identify applicable federal, state, and
local laws and policies regarding the confidentiality of data and the limitations on access to those
data by unauthorized persons. At the same time, data should be made available to researchers and
the public in keeping with applicable laws governing access to public records.

4.7.5. Persons hired or recruited to process or otherwise work with data should meet applicable 
confidentiality and access guidelines. They should have a reasonable disinterest in or be unfamiliar
with the individuals or entities from whom the data are being collected, and they should 
participate in training and orientation programs designed to ensure that they understand, accept, 
and follow these and all related guidelines. 

4.7.6. Data processing staff should store data using procedures that minimize the potential for accidental
loss or damage (e.g., appropriate labeling to reduce the likelihood that paper or film documents 
will be destroyed or misfiled) and that enable the staff to access or recover data within a 
reasonable time and at a reasonable cost. 
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4.7. Standard for Maintaining Programs and Data Files



4.7.7. The medium used for storing data should be chosen based on the needs of users who will access
the data and the size of the project. Storage media include: 

■ Original data collection forms (e.g., in a file cabinet)

■ Magnetic tape

■ Microfiche/microfilm

■ Hard disk

■ Floppy disk

■ CD ROM

■ Optical disk

4.7.8. Access methods should be chosen based on the number of persons who will access the data, the 
locations and processing capabilities of these individuals, frequency of access, and requirements
for simultaneous access. Data can be accessed through: 

■ Hard copy reports 

■ Direct connections to mainframe computer facilities 

■ Local/wide area network attachments 

■ Modem access to mainframe computers and network systems 

■ Stand-alone microcomputers 

Data can be accessed in the following types of file structures:

■ Database management system

■ Flat file on a floppy disk

■ Indexed file

4.7.9. Staff should develop a priority list of functions that are critical to the data processing system and 
determine how these functions would be carried out within a reasonable time period if computer 
equipment went down or data files were lost. 
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4.7.10. The following items should be duplicated and stored offsite: 

■ A catalog of the automated system and a description of the application (with the name and 
telephone number of a contact person) 

■ A log with the location of backup software and program files and a priority list for 
restoration of service 

■ Telephone numbers of software maintenance providers 

■ Special setup instructions, if any, for configuring software 



RELATED STANDARDS 

6.3. Standard for Releasing Data 
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Develop a schedule that specifies when to create backup files. The backup schedule depends upon 
factors such as how often data files are updated, the consequences of losing data, and the cost of 
recovering lost data. 

Create a backup copy of each computer program every time a change is made in the program code 
to ensure that data can be reconstructed. 

Place identification numbers, including the date and time the last change was made, on each 
version of a computer program. Consider placing an identification flag on records to indicate
which version of a program was used to process the data. The version flag can be used to select 
for regeneration records that have been processed by a faulty version of a computer program. 

Back up transaction files generated by data entry or interactive system logs and master files for 
possible system or file regeneration. 
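
The version flag suggested in this checklist can be sketched in Python as follows. The sketch assumes each record is stamped with the identifier and last-change date of the program that processed it. The identifier format and the field name `program_version` are hypothetical.

```python
def stamp(record, program_id, changed_at):
    """Return a copy of the record carrying a program version flag."""
    flagged = dict(record)
    flagged["program_version"] = f"{program_id} {changed_at}"
    return flagged


def select_for_regeneration(records, faulty_versions):
    """Return the records that were processed by a faulty program version."""
    return [r for r in records if r.get("program_version") in faulty_versions]
```

When a program version is later found to be faulty, only the records carrying that version's flag need to be regenerated from the backed-up transaction files, rather than the whole master file.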
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1. Determine by checking with other project staff if data files are subject to state or federal retention 
laws and, if so, comply with retention period requirements for file recovery or system restart 
procedures for automated or manual systems. 

2. Determine the number of records, files, and programs that need to be retained. 

3. Establish a procedure and schedule for storing historical copies of files or programs in a secure 
location. 

4. Determine the life expectancy of the storage media that are being used and schedule regeneration
of copies prior to the expiration dates. 

5. Keep multiple copies of historical files as backups in case some files are corrupted or damaged 
during storage. 

6. Develop a file-naming convention to identify files stored at the off-site facility. 

7. Develop a plan for documenting the retention of data files and programs. Use a tape-labeling 
system to clearly identify the contents and the intended destruction dates (if any) of data files in 
order to prevent accidental erasure or destruction of files. 

8. Store hard copy documentation of historical files in at least two locations to protect against loss
or destruction. 

9. Develop a procedure for producing archive files after each transaction or master file update (or 
within a reasonable timeframe for a particular processing facility). 

10. Document archival procedures and include archival files in backup and disaster recovery 
procedures. 
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jpessing system components, development, modification a**d 
' " * **• system development — « **u^^<* 



GUIDELINES 

4.8.1. A complete set of documentation for the requirements, design, development, testing, and use of 
the data processing system should be kept in a central, safe location. A note should be made of
the location of any documentation stored elsewhere. 

4.8.2. Systems and/or program planning and design activities should be documented. Typically, these 
documents consist of systems and program requirements, specifications, preliminary and detailed 
design documents, database specifications, program specifications, test plans, and test analysis 
reports. (See 4.8.2. Checklist for Writing Software Documentation.) 

4.8.3. Process descriptions should explain the purpose of the process, the approach, the process flow 
(including net inputs and outputs), and the staff skill levels required to perform the process. 

4.8.4. Hardware documentation should be adequate to permit the staff to use the equipment efficiently 
and effectively. Overly technical descriptions, however, should be avoided. The documentation 
should identify the hardware model, capabilities (main memory, speed, and compatibility with 
other hardware), operating system, operating instructions, communication requirements, and 
limitations. A list should be kept of those who use the documentation, and special maintenance 
requirements should be noted. 

4.8.5. User documentation should enable the staff to access and use computer programs and data files 
effectively. The documentation should be written in nontechnical language and should not include 
detailed technical descriptions of computer programs. (See 4.8.5. Checklist for Writing User 
Documentation.) 

4.8.6. A consistent format should be used to label and document tapes for release. 
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4.8. Standard for Documenting Data Processing Activities

4.8.2. Checklist for Writing Software Documentation

1. Provide a general information section in the software systems documentation that describes: 

■ The hardware environment 

■ Support software such as the operating system and compilers 

■ The programming languages 

■ Computer memory requirements 

■ Storage media such as magnetic tape or floppy diskette 

■ The system libraries and data-set naming conventions 

■ The sequence of software operations

■ System maintenance procedures

2. Depict the sequence of operations, including input and output files and tables, in a system 
flowchart and specify the names and locations of program listings and file layouts. 

3. Document each discrete module separately: 

■ List all program inputs 

■ Describe the processing logic 

■ Define program variables

■ Explain algorithms

■ Identify and describe unusual programming approaches 

■ Describe the program run procedures, including required parameters 

■ Identify and define output files 
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4. Provide comments as part of each computer program that are sufficient to enable a person other 
than the original programmer to modify the program with little or no difficulty. Include an 
introductory paragraph describing the purpose of the program and giving the date the program was 
created or modified. Introduce each major processing routine with a brief explanation of its 
purpose. 

5. Document program maintenance procedures for systems and programs that will be in use long 
enough to make modification likely. The documentation should include: 

■ A description of programming conventions 

■ Program testing and correction procedures 

■ Special maintenance requirements such as table updates or security provisions 

■ The location of program listings 

■ Typical program execution time 

■ Data-set naming conventions 

6. Cross-reference and index all documentation.
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4.8. Standard for Documenting Data Processing Activities 


4.8.5. Checklist for Writing User Documentation





1. Include the following information in program documentation for users:

■ Intended audience 

■ Program name 

■ Location (appropriate library or directory reference) 

■ Run instructions, including parameters required for successful execution 

■ Typical run time 

■ Identification and description of files, reports, and other program outputs 

2. Specify the following in the data file documentation: 

■ Directory or library name 

■ Data set name 

■ Recording medium (e.g., tape, disk) 

■ Record length 

■ File size 



3. List each data element by its mnemonic name and meaning, starting position on the file, length 
in characters, and data type (alpha or numeric). 

4. Use a comments field where appropriate to provide additional information about a data element. 
This information may include the element's permissible values and meanings, the composition of 
derived elements, and the format of specific elements such as dates. 

5. Make sample output formats available for reference, training, and comparisons to actual output. 

6. Walk through documentation with potential program users and modify as necessary. 
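
The data file documentation described in this checklist can also be kept in machine-readable form. The following Python sketch is one possible way to do that, not a prescribed format: the element names (SCHID, ENROLL, RPTDATE), their positions, and the helper functions are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """One documented data element on a fixed-width file."""
    name: str           # mnemonic name
    meaning: str        # plain-language meaning
    start: int          # starting position on the file (1-based)
    length: int         # length in characters
    dtype: str          # "alpha" or "numeric"
    comments: str = ""  # permissible values, derivation, date format, etc.


# Hypothetical layout used only for illustration.
LAYOUT = [
    DataElement("SCHID", "School identifier", 1, 6, "alpha"),
    DataElement("ENROLL", "Total enrollment", 7, 5, "numeric",
                "Blank means not reported"),
    DataElement("RPTDATE", "Report date", 12, 8, "numeric",
                "Format YYYYMMDD"),
]


def record_length(layout=LAYOUT):
    """Record length implied by the documented elements."""
    return max(el.start + el.length - 1 for el in layout)


def parse_record(line, layout=LAYOUT):
    """Split one fixed-width record into named elements."""
    record = {}
    for el in layout:
        raw = line[el.start - 1:el.start - 1 + el.length].strip()
        if el.dtype == "numeric":
            record[el.name] = int(raw) if raw else None
        else:
            record[el.name] = raw
    return record
```

Keeping the layout in one structure lets the same definitions drive both the printed documentation and the programs that read the file, which reduces the chance that the two drift apart.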

4.9. Standard for Evaluating Data Processing Systems 



PURPOSE: To ensure that data processing systems function as intended. 



GUIDELINES 

4.9.1. Data producers should periodically review the evaluation criteria and methods developed during 
the Systems Design Stage. 

4.9.2. Evaluation of automated systems should focus on the accuracy of the data, adequacy of the 
controls and security procedures, performance of hardware and software, and the degree to which 
the expected outcomes and costs are attained. 

4.9.3. A procedure should be developed to evaluate the handling of problems encountered during 
automated processing. (See 4.9.3. Checklist for Documenting Data Processing Problem 
Resolution.) The procedure may be manual or automated depending on the complexity of the 
system and the volume of transactions. 

4.9.4. Evaluation of data preparation should address the overall flow of data collection forms through 
the system, as well as the percentage of forms verified and the accuracy rates for coding and key 
entry. 

4.9.5. Data processing staff should meet periodically with staff from other phases of the data collection 
activity to evaluate how well systems requirements are being met and the extent to which expected 
outcomes are being achieved. 



RELATED STANDARDS AND CHECKLISTS 

4.1. Standard for Planning for Systems Requirements 

4.2. Standard for Designing Data Processing Systems 

4.3. Standard for Developing Data Processing Systems 

4.6. Standard for Preparing Data for Processing and Analysis 




4. Data Preparation and Processing 




1. Maintain a problem documentation log to record automated system malfunctions. 

2. Design problem-handling procedures that require a user or operator to complete a problem report 
form or that specify an automated report file that catalogs problem reports generated by either the 
computer system or a system operator. 

3. Include sufficient information about a problem to enable an analyst, computer programmer, or 
system operator to follow up on the problem. Such information may include: 

■ Name of individual discovering the problem 

■ Program or procedure name 

■ Job step where program problem is found 

■ Program address within job step 

■ Names of files being processed 

■ Date and time problem occurred 

■ Description of problem 

■ Copies of error messages or listings 

■ Suggested resolution 

■ Action taken 

4. Make problem documentation available to systems programmers, operators, and data users. This 
documentation is particularly important to data users because problems encountered during 
processing may affect the integrity of the data or suggest problems with the data collection 
methodology. 

5. Incorporate problem reporting procedures into the Systems Development Stage in order to ensure 
effective communication among users, programmers, and operators. 
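
Where an automated report file is chosen for the problem log, the items listed in this checklist map naturally onto a record structure. The Python sketch below is a minimal illustration under that assumption; the class name, field names, and CSV output format are hypothetical and are not specified by the standards.

```python
import csv
from dataclasses import asdict, dataclass, fields
from datetime import datetime


@dataclass
class ProblemReport:
    """One entry in a problem documentation log."""
    reported_by: str          # individual discovering the problem
    program_name: str         # program or procedure name
    job_step: str             # job step where the problem was found
    program_address: str      # program address within the job step
    files_in_process: str     # names of files being processed
    occurred_at: datetime     # date and time the problem occurred
    description: str          # description of the problem
    error_messages: str = ""  # copies of error messages or listings
    suggested_resolution: str = ""
    action_taken: str = ""


def write_log(reports, stream):
    """Write the reports to a CSV stream, one row per problem."""
    names = [f.name for f in fields(ProblemReport)]
    writer = csv.DictWriter(stream, fieldnames=names)
    writer.writeheader()
    for report in reports:
        row = asdict(report)
        row["occurred_at"] = report.occurred_at.isoformat()
        writer.writerow(row)


def open_problems(reports):
    """Reports for which no action has been recorded yet."""
    return [r for r in reports if not r.action_taken]
```

A log kept this way can be shared with programmers, operators, and data users alike, and the list of open problems gives a simple starting point for the periodic evaluation described in guideline 4.9.3.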

PHASE 5. DATA ANALYSIS 



5.0. Introduction to Data Analysis 

5.1. Preparing an Analysis Plan 

5.2. Developing Analysis Variables 
5.3. Applying Appropriate Weights 

5.4. Estimating Sampling and Nonsampling Errors 

5.5. Determining Statistical Significance 

5.0. Introduction to Data Analysis 



Data analysis is the process by which information generated by the design, data collection, and 
data processing is brought together and used to answer the original study questions. The precise sequence 
of steps involved and the particular analytic techniques employed may vary greatly depending on a study's 
purpose and design. For example, some studies require weighting of data and attention to sampling errors; 
others do not. Thus, different groups of standards included in this phase apply to different types of 
analyses. 

The decision made during the design phase about whether to use a sample or universe of the 
population being studied determines, in part, the types of analyses that will be required and the standards 
that will apply. For example, standards for dealing with sampling errors and significance tests do not 
apply to most universe studies. By definition, the use of a universe precludes the possibility of sampling 
errors (because the entire population, not a representative sample, is being studied). Weighting of data 
for universe studies is done only to compensate for unit nonresponse. When data are collected from a 
sample of a population, provisions are needed for sampling errors, significance tests, and weighting of data 
to make generalizations about the entire population. 

Although many of the standards included in this phase are geared specifically toward analysis of 
survey data, the principles apply to all types of data analyses. Certain basic issues and questions (e.g., 
What cautions must be observed? What types of remarks should be avoided?) need to be raised regardless 
of the type of analysis. 

Words in italics are defined in the glossary. 




5. Data Analysis 



5.1. Standard for Preparing an Analysis Plan 



PURPOSE: To ensure that the proposed analyses will address the study questions and purposes. 





GUIDELINES 



5.1.1. In most data collection activities, formal, written analysis plans should be prepared at three points: 

■ A preliminary analysis plan should be developed during the design phase (a) as a check on 
the completeness, adequacy, and internal consistency of the study questions, methodology, 
sample design, and measurement instruments, and (b) as a guide for planning of data 
collection and data processing activities. 

■ An expanded analysis plan should be prepared after the design has been finalized and data 
collection has begun. The expanded plan should incorporate any significant design or 
implementation changes that occurred after the preliminary plan was developed and should 
expand upon the preliminary plan by providing detailed specifications for the development 
and statistical treatment of all analysis variables. 

■ A final analysis plan should be prepared after the data processing has been completed and 
the analysis variables have been constructed and documented. The final plan should present 
the results of the analysis variable development and should incorporate any changes in the 
substantive analyses required by data problems or other unexpected discoveries at the variable 
development stage. 

5.1.2. At each stage of development (preliminary, expanded, final), written analysis plans should be 
organized around the study questions. For each question, the plan should specify: 

■ The general form of the comparisons or tabulations that will be performed to answer the 
question, including the unit of analysis (e.g., student, school, district), the analytic techniques 
to be employed, and the anticipated sample sizes that will be involved. 

■ The variables that will be used in the analysis. 

■ How each of the analysis variables will be derived: what source data will be used (e.g., what 
items on the data collection instrument, what achievement tests) and how the variable will 
be constructed and evaluated prior to use in substantive analyses. 

■ (Expanded and final plans only): The specific statistical analyses that will be performed 
(e.g., the table shells to be used in crosstabulations; the specific tests of statistical significance 
of time trends or group differences), including the software to be used. 




5.1.3. Since analysis plans constitute a blueprint and a guide for the anticipated future direction of the 
study, they should be shared with and reviewed by all pertinent members of the data production 
team and, as appropriate, with data requestors, outside advisors, and data users who may be able 
to offer constructive suggestions. 



RELATED STANDARDS 

2.1. Standard for Formulating and Refining Study Questions 

5.2. Standard for Developing Analysis Variables 




5.2. Standard for Developing Analysis Variables 




GUIDELINES 

5.2.1. Response rate information (number of cases sampled, number of responses obtained, and response 
rate) should be presented for all analysis variables in order to evaluate possible bias due to 
nonresponse. For complex variables created from several data elements, response rate information 
should be presented for the component elements of the variable. 

5.2.2. Missing values should be imputed for all analysis variables, using a method of imputation that is 
unlikely to bias the results. (See 5.3.8. Checklist for Imputing for Item Nonresponse.) 

5.2.3. For categorical variables (school control, student ethnicity, etc.), all of the categories should be 
fully defined, and the number of cases in each category should be reported. When samples are 
involved, both sample sizes (unweighted frequencies) and estimated population totals (weighted 
frequencies) should be presented for all categories in order to evaluate the adequacy of sample 
sizes for individual categories and to verify the absence of out-of-range values or other anomalies. 

5.2.4. For continuous variables (e.g., enrollment size, family income), basic response distribution 
information should be provided (e.g., score range, median, mean, and standard deviation) for the 
population represented in the study and for all subpopulations being compared on the variable, in 
order to evaluate the suitability or limitations for use in further analyses. 

5.2.5. For complex or judgmental variables where response consistency is a particular concern (e.g., 
student test scores, teacher ratings of student aptitudes, interviewer ratings of respondent attitudes), 
quantitative indices of response reliability should be obtained and reported. The design should 
provide for the collection and computation of appropriate reliability indices, unless adequate 
reliability information is available in the literature. To be adequate, secondary source reliability 
information should be recent, should have been replicated (i.e., more than one previous 
study should be cited), and should involve measures and populations identical or very similar to 
those in the current data collection activity. 







5.2.6. For complex or judgmental variables where there is a particular concern about whether or how 
well the variable actually measures what is intended (e.g., student test scores, teacher ratings of 
student aptitudes, interviewer ratings of respondent attitudes), documentation of adequate response 
validity should be obtained and reported. The study design should provide for the collection 
and/or the computation of appropriate validity evidence, unless adequate validity information is 
available in the literature. To be adequate, secondary source validity information should be recent, 
should have been replicated (i.e., more than one previous study should be cited), and should 
involve measures, purposes, and populations that are identical or very similar to those in the 
current data collection activity. 
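
The response rate and frequency information called for in guidelines 5.2.1 and 5.2.3 can be produced with very little code. The Python sketch below is illustrative only; the record layout (a list of dictionaries with a "weight" entry) and the function names are assumptions made for the example.

```python
from collections import defaultdict


def response_rate(n_sampled, n_responded):
    """Response rate: responses obtained divided by cases sampled."""
    if n_sampled <= 0:
        raise ValueError("number of cases sampled must be positive")
    return n_responded / n_sampled


def category_frequencies(records, variable, weight="weight"):
    """Return {category: (unweighted count, weighted total)}.

    The unweighted count is the sample size in the category; the
    weighted total is the estimated population total.
    """
    unweighted = defaultdict(int)
    weighted = defaultdict(float)
    for rec in records:
        category = rec[variable]
        unweighted[category] += 1
        weighted[category] += rec[weight]
    return {c: (unweighted[c], weighted[c]) for c in unweighted}
```

Listing both figures side by side makes thin categories and unexpected category codes easy to spot before substantive analysis begins.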



RELATED STANDARDS 

2.5. Standard for Transforming Study Question Concepts into Measures 

5.4. Standard for Estimating Sampling and Nonsampling Errors 

6.1. Standard for Presenting Findings 

6.5. Standard for Preparing Documentation and Technical Reports 

5.3. Standard for Applying Appropriate Weights 



population from which; $1^^ 



GUIDELINES 

5.3.1. Data should be weighted when a sample is selected randomly with known probabilities of 
selection at each stage of sampling, either when different sampling rates are used for subgroups 
of the population or when aggregate population totals are needed. The use of weighted estimates 
should be agreed to at the onset of the design process in order to allow adequate planning time. 

5.3.2. Analyses should employ estimates of sampling errors and confidence intervals that are appropriate 
to the design and procedures used. 

5.3.3. A recordkeeping system should be adopted that will guarantee that all information required for 
weighting will be available when needed. 

5.3.4. A data file should contain each component weight and a base weight for all respondents and 
nonrespondents. The component weight for each stage of sampling is equal to the reciprocal of 
the probability of selecting the unit at that stage. The base weight is a composite of all the 
component weights, and it accounts for all the stages of sampling of units. 

5.3.5. If the unit response rate is less than 100 percent, adjusting the base weights for nonresponse to 
reduce bias should be considered. (See 5.3.5. Checklist for Weighting for Unit Nonresponse.) 
Nonresponse bias occurs when respondents differ as a group from nonrespondents on a survey 
item. (For example, if respondents have a higher dropout rate than nonrespondents, an estimated 
dropout rate based only on respondent data, unadjusted for nonresponse, will be biased upward; 
the actual dropout rate of the survey population will be overestimated.) 

5.3.6. If independent and accurate estimates of population totals are available from an outside source, 
using poststratification adjustments to reduce bias and variance should be considered. 
Poststratification adjustments are weight adjustments which force survey estimates to match 
independent population totals within selected poststrata. (See 5.3.6. Checklist for Making 
Poststratification Adjustments.) 

5.3.7. The final adjusted weight should be used for making estimations. The final adjusted weight is 
a composite of all base weight components, nonresponse adjustments, and the poststratification 
factor, if any. 

5.3.8. If the item response rate is less than 100 percent, imputing for missing values or weighting should 
be considered. (See 5.3.8. Checklist for Imputing for Item Nonresponse.) 
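
The weights described in guidelines 5.3.4 and 5.3.7 can be illustrated with a short Python sketch. It assumes that the "composite" of component weights and adjustments is their product, which is the usual convention but is not stated explicitly in the guidelines; the function names are hypothetical.

```python
from math import prod


def component_weights(stage_probabilities):
    """Reciprocal of the selection probability at each sampling stage."""
    for p in stage_probabilities:
        if not 0 < p <= 1:
            raise ValueError("selection probabilities must be in (0, 1]")
    return [1.0 / p for p in stage_probabilities]


def base_weight(stage_probabilities):
    """Base weight, taken here as the product of the component weights."""
    return prod(component_weights(stage_probabilities))


def final_weight(base, nonresponse_adjustment=1.0, poststrat_factor=1.0):
    """Final adjusted weight; a factor of 1.0 means no adjustment."""
    return base * nonresponse_adjustment * poststrat_factor
```

For example, a student in a school selected with probability 0.1, and then selected within the school with probability 0.25, has component weights of 10 and 4 and a base weight of 40.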

5.3.5. Checklist for Weighting for Unit Nonresponse 

1. If there is any unit nonresponse, assume that respondents differ from nonrespondents and that 
nonresponse bias exists. Make weighting adjustments for unit nonresponse to reduce nonresponse 
bias. Calculate variances of weighted estimates. 

2. In preparing to weight for unit nonresponse, define subclasses of units so that data providers and 
nonproviders within a subclass are as similar as possible with respect to the variable of interest. 
The similarity between data providers and nonproviders is increased when the characteristic used 
to define the subclass is strongly related to the variable being analyzed (e.g., in a school-wide 
student achievement test, subclasses could be defined by student age and grades). 

3. The variables used in sampling define one type of subclass often used for adjusting for unit 
nonresponse because these variables are usually defined in relation to the study variables of 
greatest interest. To adjust for nonresponse within a subclass, multiply the responses (inflated by 
sampling weights) by the nonresponse adjustment (the ratio of the selected sample size to the 
number of data providers). 

4. Poststratification is recommended for adjusting for unit nonresponse if accurate and usable 
population data are available from a source outside of the study. (See 5.3.6. Checklist for 
Making Poststratification Adjustments.) 

5. Make sure each subclass defined for weighting contains at least 20 units; if it does not, the 
variance of the estimate from that subclass will be unduly increased. If a subclass has fewer than 
20 units, combine it with another similar subclass having relatively few data providers. (Note that 
although increasing the number of subclasses tends to reduce nonresponse bias, there is little 
benefit in using more than four or five classifying variables to define subclasses.) 

6. Examine the ratio of the largest nonresponse weight to the smallest. If the ratio exceeds three to 
one, consider collapsing cells to keep from unduly increasing the variance of estimates. 
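
The arithmetic in this checklist can be sketched in Python as follows. The sketch assumes that counts of selected units and data providers are available for each subclass, and it treats the "nonresponse weight" in the last step as the nonresponse adjustment factor; both the data layout and the function names are assumptions made for illustration.

```python
def nonresponse_adjustments(selected, responded):
    """Adjustment per subclass: selected sample size / data providers.

    Both arguments map a subclass label to a count of units.
    """
    factors = {}
    for cell, n_selected in selected.items():
        n_responded = responded.get(cell, 0)
        if n_responded == 0:
            raise ValueError(f"no data providers in subclass {cell!r}")
        factors[cell] = n_selected / n_responded
    return factors


def small_cells(counts, minimum=20):
    """Subclasses with fewer than `minimum` units, to be combined."""
    return sorted(cell for cell, n in counts.items() if n < minimum)


def weight_ratio(factors):
    """Ratio of the largest adjustment factor to the smallest."""
    return max(factors.values()) / min(factors.values())
```

A subclass flagged by `small_cells`, or a `weight_ratio` above three, is a signal to collapse cells and recompute the adjustments rather than an automatic correction.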

5.3.6. Checklist for Making Poststratification Adjustments 

1. Use analysis variables available from the independent estimates of the population to define groups 
of units (adjustment cells) for the population. 

2. Be careful not to define cells so precisely that the representation of the sample in any cell is too 
sparse. 

3. Obtain independent totals for each cell once the adjustment cells have been defined. 

4. Use the nonresponse-adjusted base weights to make study estimates for the same cells. 

5. Divide the independent estimate by the study estimate to create a poststratification factor for 
each cell. 

6. Examine the poststratification factors for extreme values. If extreme values are found, determine 
whether sample estimates correspond with independent estimates. Reduce extreme values, for 
example, by collapsing cells so that the sample size in the cells is large enough (i.e., greater than 

7. 
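
The calculation of poststratification factors can be sketched in Python as shown below. The record layout (dictionaries with "cell" and "weight" entries, where the weight is the nonresponse-adjusted base weight) and the function names are assumptions made for the example.

```python
def poststratification_factors(independent_totals, records,
                               cell_key="cell", weight_key="weight"):
    """Factor per cell: independent total / weighted study estimate."""
    study_totals = {}
    for rec in records:
        cell = rec[cell_key]
        study_totals[cell] = study_totals.get(cell, 0.0) + rec[weight_key]
    factors = {}
    for cell, total in independent_totals.items():
        estimate = study_totals.get(cell, 0.0)
        if estimate <= 0:
            raise ValueError(f"no sample cases in cell {cell!r}")
        factors[cell] = total / estimate
    return factors


def apply_factors(records, factors, cell_key="cell", weight_key="weight"):
    """Return copies of the records with a final_weight entry added."""
    return [
        dict(rec, final_weight=rec[weight_key] * factors[rec[cell_key]])
        for rec in records
    ]
```

After the factors are applied, the weighted total in each cell matches the independent total, which is a convenient check that the adjustment was carried out as intended.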

5.3.8. Checklist for Imputing for Item Nonresponse 

1. Compare item respondents with item nonrespondents on the basis of other survey variables 
correlated with the item in question. Imputation may not be necessary if the differences are 
minor. 

2. If appropriate unit-level data are available from external sources, such as administrative records, 
consider substituting them for the missing values. Such data, however, are often unavailable or 
are in a format that does not make them useful for imputation. 

3. Several imputation methods are available. The one that is most appropriate depends on the 
intended analysis: 

■ If the goal of the analysis is to produce statistical estimates of aggregates, such as population 
totals, consider substituting the mean of respondents within a subclass for the missing values 
in the subclass. This method keeps the variance of the estimate relatively small. 

■ If the goal of the analysis is to study relationships among variables, consider forming 
subclasses correlated with the variables of interest and then duplicating randomly selected 
case values within the subclasses. Ensure that the number of values duplicated equals the 
number of missing values. A method often used instead of random selection is the selection 
of the next case in the computer file after the missing value (once the file is sorted in order 
by the subclasses). This is referred to as "hot deck" imputation. 

■ If the research involves analysis of a few variables of critical importance, and if time and 
resources permit, regression or other multivariate analyses may be used to impute missing 
values. Study variables that best predict the critical values of respondents can then be used 
to generate values for nonrespondents. This method can minimize the bias of population 
estimates. Use caution, however, in any subsequent multivariate analyses of these types of 
imputed variables, especially if they are used as outcomes, because an artificial relationship 
between the outcome and explanatory variables has already been built in. 

4. Flag imputations on computer files. 
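
Two of the imputation methods in this checklist, subclass mean substitution and the sorted-file form of hot deck imputation, can be sketched in Python as follows. The sketch assumes that missing values are stored as None, that records are dictionaries, and that a case with no later donor in its subclass is left missing; these choices and the function names are illustrative only. Both functions flag imputed values, as the final step of the checklist requires.

```python
def impute_class_mean(records, item, subclass):
    """Replace missing values with the respondent mean of the subclass."""
    sums, counts = {}, {}
    for rec in records:
        if rec[item] is not None:
            key = rec[subclass]
            sums[key] = sums.get(key, 0.0) + rec[item]
            counts[key] = counts.get(key, 0) + 1
    result = []
    for rec in records:
        new = dict(rec, imputed=False)
        if rec[item] is None and rec[subclass] in counts:
            new[item] = sums[rec[subclass]] / counts[rec[subclass]]
            new["imputed"] = True
        result.append(new)
    return result


def impute_hot_deck(records, item, subclass):
    """Sort by subclass, then copy the next reported value in the file.

    Only reported (never imputed) values are used as donors.
    """
    ordered = sorted(records, key=lambda rec: rec[subclass])
    result = [dict(rec, imputed=False) for rec in ordered]
    for i, rec in enumerate(ordered):
        if rec[item] is not None:
            continue
        for donor in ordered[i + 1:]:
            if donor[subclass] != rec[subclass]:
                break  # reached the next subclass without a donor
            if donor[item] is not None:
                result[i][item] = donor[item]
                result[i]["imputed"] = True
                break
    return result
```

Because each returned record carries an `imputed` flag, later analyses can report how many values were imputed or can be rerun with the imputed cases excluded.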

5.4. Standard for Estimating Sampling and Nonsampling Errors 

GUIDELINES 

5.4.1. Methods should be determined for computing sampling errors that are appropriate given the 
procedures that will be used to select the sample. (If commercially available software packages 
are used, analysts should ensure that default methods are appropriate. If not, analysts should 
develop procedures for computing sampling errors early in the project.) 

5.4.2. If replication techniques are used for estimating sampling errors, replicates should be formed at 
the time of sample selection. Replicate weights should be developed simultaneously with the 
ordinary full-sample weights used for estimation. Nonresponse adjustments, poststratification, and 
any other weight adjustments should be recomputed for replicate weights using the same methods 
used for the full-sample weights. 

5.4.3. Modelling the sampling errors (e.g., calculating an average design effect or using regression to fit 
a curve) should be considered as an alternative to producing a separate estimate of sampling error 
for every published statistic. Modelling can be done for different subclasses of the population. 

5.4.4. Major sources of nonsampling error should be investigated, and rules should be developed for 
dealing with incomplete or inconsistent data (e.g., under what circumstances will data providers' 
responses be included or excluded from some or all analyses). 

5.4.5. Error rates in processing (scanning and data transcription) should be quantified. 

5.4.6. Descriptive statistics for each of the variables and important combinations of variables should be 
examined to detect outliers (e.g., minimum and maximum values, mean, and variance). A policy 
dealing with outliers should be formulated. 

5.4.7. Reasonableness and consistency checks should be performed throughout the data analysis process. 




5.4.8. Comparisons of estimates from the data collection activity with estimates from reliable 
independent sources should be considered for key variables. 
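
The replication approach mentioned in guideline 5.4.2 can be illustrated with a delete-one-group jackknife. This particular method is chosen here only as an example; the guidelines do not prescribe a specific replication technique, and the function names are hypothetical. The Python sketch also omits the recomputation of nonresponse and poststratification adjustments for each replicate, which guideline 5.4.2 requires in practice.

```python
def weighted_mean(values, weights):
    """Weighted mean of the values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jackknife_replicate_weights(weights, groups, n_groups):
    """One set of weights per replicate.

    In replicate r, units in group r get weight 0 and all other
    units have their weight multiplied by n_groups / (n_groups - 1).
    """
    factor = n_groups / (n_groups - 1)
    return [
        [0.0 if g == r else w * factor for w, g in zip(weights, groups)]
        for r in range(n_groups)
    ]


def jackknife_variance(values, weights, replicate_weights):
    """Variance of the weighted mean from the replicate estimates."""
    full = weighted_mean(values, weights)
    n_reps = len(replicate_weights)
    estimates = [weighted_mean(values, rw) for rw in replicate_weights]
    return (n_reps - 1) / n_reps * sum((e - full) ** 2 for e in estimates)
```

The square root of the returned variance is the estimated standard error of the weighted mean, which can then be used for confidence intervals and significance tests.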

RELATED STANDARDS 

6.1. Standard for Presenting Findings 

6.5. Standard for Preparing Documentation and Technical Reports 

5.5. Standard for Determining Statistical Significance 

PURPOSE: To ensure that statistical analyses can be properly interpreted. 















GUIDELINES 

5.5.1. Confidence intervals should be included for all key statistics. 

5.5.2. Significance levels for the results of statistical tests should be specified before inferences are 
made. The probability of type I and type II errors should be considered when conducting 
significance tests. 

5.5.3. Selection of levels of statistical significance should be based upon substantive factors in the data 
collection activities. For example, when the consequences of being wrong are severe, a higher 
level of significance should be used. 

5.5.4. Because not all statistically significant results have practical significance, the practical significance 
of statistical comparisons should be described. 

5.5.5. For analyses designed to examine issues for which specific hypotheses were not prescribed prior 
to looking at the data, it should be noted that significance levels and confidence intervals are 
merely indicators of the potential existence of relationships. 

5.5.6. For analyses designed to test hypotheses that were developed prior to reviewing the data, errors 
associated with making several simple comparisons simultaneously should be addressed. Multiple 
comparison procedures should be used to reduce the probability of making incorrect statements. 
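
Guidelines 5.5.1 and 5.5.6 can be illustrated with a short Python sketch. It assumes large samples, so that a normal approximation applies, and it uses the Bonferroni adjustment as one example of a multiple comparison procedure; the guidelines do not name a specific procedure, and the function names are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(estimate, standard_error, level=0.95):
    """Normal-approximation confidence interval for an estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * standard_error, estimate + z * standard_error


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under Bonferroni adjustment."""
    return alpha / n_comparisons


def significant_differences(comparisons, alpha=0.05):
    """Test several differences at an overall significance level.

    `comparisons` maps a label to (difference, standard error).
    Returns {label: True if significant after adjustment}.
    """
    per_test = bonferroni_alpha(alpha, len(comparisons))
    z_critical = NormalDist().inv_cdf(1 - per_test / 2)
    return {
        label: abs(diff / se) > z_critical
        for label, (diff, se) in comparisons.items()
    }
```

With three comparisons at an overall level of 0.05, the critical value rises from about 1.96 to about 2.39, so a difference that is 2.2 standard errors from zero would no longer be reported as significant.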

RELATED STANDARDS AND CHECKLISTS 
6.1. Standard for Presenting Findings 

6.5. Standard for Preparing Documentation and Technical Reports 




PHASE 6. REPORTING AND DISSEMINATION OF DATA 



6.0. Introduction to Reporting and Dissemination of Data 

6.1. Presenting Findings 

6.2. Reviewing the Report 

6.3. Releasing Data 

6.4. Disseminating Data 

6.5. Preparing Documentation and Technical Reports 




6.0. Introduction to Reporting and Dissemination of Data 



Most data collection and analysis efforts culminate in one or more reports on the findings. The 
standards included in this phase are designed to ensure that reports are prepared, documented, and 
reviewed in a manner that enhances their accuracy, credibility, and usefulness. The standards also address 
the release and dissemination of data. When databases are accessible to the public, the relevant standards, 
particularly those related to confidentiality, should be considered. 

The standards for reporting and dissemination of data make a distinction between substantive 
reports that describe study findings and technical reports that document study procedures. Most 
substantive reports, however, contain some methodological information; thus, many of the standards for 
technical reports apply to substantive reports as well. For the purposes of these standards, a technical 
report includes comprehensive documentation and evaluation of data collection, processing, and analysis 
procedures. 

When preparing a report for a professional journal, a private organization, or a government agency, 
those who use these standards should determine if there are additional guidelines that must be followed. 

Words in italics are defined in the glossary. 

6.1. Standard for Presenting Findings 

PURPOSE: To ensure that results are 

GUIDELINES 

6.1.1. Planning for a report on the results of data collection activities should take place during the design 
phase of the project. (See 6.1.1. Checklist for Planning the Report.) 

6.1.2. Reports should include major sections on the following topics: background and purpose, study 
questions and hypotheses, methodology (design, data collection, processing, and analyses), 
findings, conclusions, and recommendations for further investigation. 

6.1.3. Reports should include a summary of the results, with emphasis on key findings and conclusions 
(e.g., executive summary). 

6.1.4. Reports should describe the rigor of the methodology and quality control procedures used in the 
data collection activities, as well as the validity, reliability, and context of the data, to enable 
readers to judge the credibility of the findings. (See 6.1.4. Checklist for Describing the Rigor 
of the Methodology and Quality Control Procedures.) 

6.1.5. Reports should be complete and concise, focusing on the topic at hand and addressing only issues 
that relate directly to the data being reported. (See 6.1.5. Checklist for Developing a Complete, 
Concise, and Focused Report.) 

6.1.6. Information should be presented in a manner that is appropriate for the intended audiences, 
without jeopardizing accuracy through oversimplification. (See 6.1.6. Checklist for Presenting 
Findings in a Manner that is Appropriate for the Intended Audiences.) 

6.1.7. Tables and graphs should be accurate, complete, and easy to interpret. Each table and graph 
should be able to stand alone. (See 6.1.7. Checklist for Presenting Data in Tables and 
Graphs.) 

6.1.8. Reports should be organized and written with a clear understanding of the potential impact they 
may have. Authors should be aware of the range of actions available to the audiences. 

■ Reports should clearly identify the year or other time period the data cover and distinguish 
between that time period and the year in which the report was released. 

■ Reports on studies should make a clear distinction between research findings and the policy 
implications that may be inferred from these findings. 

6.1.9. Reports should be prepared in a timely fashion to ensure that they have an impact on decisions 
that the information could plausibly affect. 

RELATED STANDARDS 



5.5. Standard for Determining Statistical Significance 

6. REPORTING AND DISSEMINATION OF DATA 

6.1. Standard for Presenting Findings 

6.1.1. Checklist for Planning the Report 

The plan for producing a report should include the following steps: 

1. Ensure that adequate resources are available for preparing and disseminating the report 

2. Ensure that realistic timeframes are set for producing the report 

3. Identify intended audiences 

4. Determine the audiences' information needs 

5. Assess the audiences' level of technical knowledge 

6. Identify the appropriate media for presenting findings to the intended audiences 

6.1.4. Checklist for Describing the Rigor of the Methodology and Quality Control Procedures 

1. Review the report to ensure that it is free of bias in content, style, and tone. 

2. Establish credibility by providing a balanced presentation of differing perspectives and a fair and 
impartial reporting of all data. 

3. Present study results in a manner that enables the reader to distinguish clearly between objective 
findings and opinions, judgments, and speculations. 

4. Describe potential data limitations and discuss how the data and findings may and may not be 
used. This discussion should address the degree to which the study findings and inferences can 
be generalized based upon sampling and other areas of the design and methodology. 

6.1.5. Checklist for Developing a Complete, Concise, and Focused Report 

Provide sufficient depth and breadth to give a full and accurate description of the data collection 
activities and the conclusions that were drawn. Present conclusions within the context of the 
strengths and weaknesses of the data and methodology. 

Clearly identify the types of conclusions that can be supported by the findings. In particular, 
caution readers not to infer causality from descriptive findings and not to interpret correlation as 
proof of a cause-and-effect relationship. 

To assist readers in interpreting the data, describe how the data collection activity compares with 
others with respect to theory, methodology, limitations, and findings. 

Where appropriate, propose areas for additional study and relate the findings and conclusions to 
other pertinent research and relevant information. 




6.1.6. Checklist for Presenting Findings in a Manner that is Appropriate for the Intended Audiences 




1. Consider producing separate reports for selected audiences. 

2. Make reports prepared for the public attractive and interesting as well as technically accurate. 

3. Write in straightforward, nontechnical language to the degree that the subject matter permits. 
Jargon, regional terms, and the like should be avoided. 

4. If diverse audiences are expected to read a report, use subheadings and summary data to assist the 
various audiences in locating salient information. 

5. Consider using a variety of methods for communicating information about the data collection 
activities. Brochures, fact sheets, videotapes, and slides may be used in addition to or in place 
of the traditional narrative report. 



6.1.7. Checklist for Presenting Data in Tables and Graphs 




1. Clearly label the data shown in tables and graphs. 

2. Explain tables and graphs in the text of the report. 

3. Number or letter tables and graphs to correspond to explanations in the text. 

4. Clearly label axes to indicate categories and variables. 

5. If data from sources other than the report are presented in tables and graphs, cite these sources 
clearly. 

6. Design all graphs and charts (including proportions chosen) to ensure that they accurately and 
visually reflect the data being presented. 

7. Include presence-of-error information (e.g., standard errors, frequency of missing data) in tables 
and graphs or in additional cross-referenced tables. 

8. Use footnotes or other annotations for clarification when rounded totals are inconsistent with 
unrounded displayed data (e.g., figures in columns or rows do not add up to the rounded totals). 

9. When percentages or rates are presented in tabular form, provide the actual number of units on
which the percentages or rates are based.

10. Include standard errors or confidence intervals on statistics calculated on data presented in tables. 
Include this information in the table being displayed or in the text of the report. 
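
Items 8 through 10 above can be illustrated with a short sketch. The Python functions below are hypothetical and are not part of these standards. They assume a simple random sample and a normal approximation, so a complex sample design would require a different standard error formula. The first function reports a percentage together with its base count, standard error, and confidence interval. The second returns a table footnote when rounded details do not add to the rounded total.

```python
import math


def percentage_row(count, base, z=1.96):
    """Build one table row for a percentage (simple random sampling assumed).

    Returns the percentage, the number of units it is based on, its
    standard error, and an approximate 95 percent confidence interval.
    """
    p = count / base
    se = math.sqrt(p * (1 - p) / base)
    return {
        "percent": round(100 * p, 1),
        "base_n": base,
        "se_percent": round(100 * se, 1),
        "ci_percent": (round(100 * (p - z * se), 1),
                       round(100 * (p + z * se), 1)),
    }


def rounding_note(parts, decimals=0):
    """Return a footnote if rounded details do not add to the rounded total.

    Python's round() rounds halves to the nearest even digit; whatever
    rounding rule is used should be stated in the report.
    """
    rounded_parts = [round(x, decimals) for x in parts]
    rounded_total = round(sum(parts), decimals)
    if round(sum(rounded_parts), decimals) != rounded_total:
        return "Note: Details may not add to totals because of rounding."
    return ""


if __name__ == "__main__":
    print(percentage_row(50, 200))
    print(rounding_note([33.4, 33.4, 33.4]))
```

Reporting the base count alongside each percentage lets readers judge how much weight a figure can bear, which is the intent of item 9.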







6.2. Standard for Reviewing the Report



PURPOSE: To ensured and useful 



GUIDELINES 

6.2.1. Before a report is prepared, a review plan should be developed that includes the following 
information: 

■ How reviewers will be selected 

■ Resources needed for performing the review 

■ Timelines for completing the review and making subsequent revisions 

■ Adjudication procedures for responding to reviewers' critiques 

■ Sign off by reviewers 

■ How respondents and other affected parties will have the opportunity to assess the accuracy 
of data and appropriateness of analysis 

6.2.2. Reports should be reviewed by experts in the subject matter and in relevant areas of design and 
methodology, by representatives of the intended audiences, and, where appropriate, by the 
reporting agency's information officers. 

6.2.3. Reviews should address all areas that may affect the quality and usefulness of the report to ensure 
that: 

■ Background and purpose of the study are described

■ Target population of the data collection is clearly identified

■ Scope of the collected data is appropriate to the purpose of the study

■ Methods and procedures used to collect the data and the timeframe of the collection are
described

■ The data collection instrument is appended to the report, when possible

■ Sampling methods and size are described, when applicable

■ Data processing and quality control procedures are detailed






■ Methods of analyzing the data are described 

■ Errors, biases, omissions, and limitations in the data collection processes and analyses are 
described 

■ Conclusions, interpretations, and recommendations are consistent with the nature of the data 
and their analyses 

■ Format and presentation of the data are useful and understandable for the intended audiences
and uses 

■ All graphs and tables used in the report present the data in a straightforward manner, without 
distortion 

■ Distinctions are made between differences that are statistically significant and those that are
not 

6.2.4. The review process should adhere to the planned schedule to ensure the timely release of data. 




6.3. Standard for Releasing Data





PURPOSE: To ensure the maximum feasible access to data while safeguarding confidentiality and
individual privacy rights.



GUIDELINES 

6.3.1. Data should be released in a carefully planned and systematic manner that provides for full
disclosure while protecting the confidentiality and rights of data providers and ensuring the timely
receipt of data by all affected parties.

6.3.2. Data providers who might be affected by the release of data should receive special notification 
of the date and method of release. When possible, they should receive advance copies. 

6.3.3. In cases where data providers will be identified, they should have an opportunity to verify the 
accuracy of the data to be released and should receive copies of the data if desired and not 
otherwise prohibited. 

6.3.4. The identity of data providers, if confidential, should be protected throughout the data
dissemination process. Particular attention should be paid to confidentiality when data are released
through electronic means because of the increased potential for accidental disclosure.

6.3.5. Distribution of draft reports and initial data should be carefully monitored and controlled. 

6.3.6. Procedures for release of data should be reviewed to ensure compliance with all federal, state, and 
local statutes, rules, and regulations that apply to the release of data. 

6.3.7. All identified audiences should be informed (in advance, if possible) about the procedures and 
schedule for releasing data and about the availability and distribution of any related documents 
and/or data tapes. 
6.4. Standard for Disseminating Data

PURPOSE: To ensure that information reaches intended audiences in a useful format and within a
reasonable time.





GUIDELINES 

6.4.1. A dissemination plan and schedule should be formulated during the design phase. 

6.4.2. A dissemination method should be selected which ensures that information reaches the intended 
audiences in a timely manner. Multiple methods of dissemination are often desirable. 



6.4.3. Dissemination schedules should adhere to pre-established timetables to ensure that the primary
users obtain the data on schedule.



6.4.4. Those involved in data dissemination should be aware of the school-year cycle and legislative or 
other policy formulation schedules. They should attempt to release and disseminate data during 
the time when the information will be most useful. 

6.4.5. The timing of data dissemination for recurring data collection activities should be consistent from 
one year to the next. 

6.4.6. Dissemination methods and procedures should be designed to maximize appropriate use of the 
data. 

6.4.7. News releases or announcements should be prepared to encourage appropriate use of the data. 
When appropriate, authorities on the data collection activities should brief the news media in 
person. 

6.4.8. Each product to be disseminated (e.g., reports, tabulations, news releases) should clearly state the 
time period covered by the data collection so that this time period is not confused with the release 
date of the data. 

6.4.9. Provision should be made for timely responses to inquiries regarding the data. Appropriate 
contact names, telephone numbers, and mailing addresses should be clearly and prominently 
included in reports and other materials. 
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6.5. Standard for Preparing Documentation and Technical Reports 



PURPOSE? ^ 

to permit them to be evaluated and replicated. 



GUIDELINES 

6.5.1. Technical reports should describe the original design, including the data collection, processing, 
and analysis procedures. 

6.5.2. Technical reports should also describe the actual procedures used to collect, process, and analyze
data. Reports should describe all procedures used to ensure that the data were collected in a 
uniform or consistent fashion from each sampled unit. 

6.5.3. Discrepancies between the design and the actual procedures used to collect, process, and analyze 
the data should be described and justified in technical reports. 

6.5.4. Reliability and validity estimates for all variables included in the data collection activities should 
be presented in technical reports. 

6.5.5. Technical reports should document all procedures employed to ensure that the rights, welfare,
dignity, and worth of individual data providers have been protected in a manner that is consistent
with the assurances given them and with ethical practice.

6.5.6. Technical reports should document procedures used to obtain independent review of the data
collection activities and to secure informed consent of the affected parties when adverse effects 
or risks are involved. 

6.5.7. Procedures used to ensure that data were collected in a manner that produced minimal disruption 
for individuals, programs, and organizations should be documented. 

6.5.8. All safeguards used to protect the data from distortions and the biases of data collectors and 
analysts should be documented. 

6.5.9. Technical reports should describe the procedures used to operationalize each variable. For
example, reports should explain how missing data were treated and how differentiated item 
weighting was employed. Reports should also address the procedures for "reversing" items and 
note whether items were combined to create new variables or to produce index or scale scores. 

6.5.10. Technical reports should describe the procedures used to quantify or code and enter the data. 

6.5.11. Technical reports should describe in detail all procedures used to store raw and processed data,
including interview protocols, training procedures, coding manuals, archived dictionaries, and the
like.

6.5.12. Documentation of analytic procedures should be sufficiently detailed to permit someone with
access to the data to replicate the results.

6.5.13. When utilized, procedures for rounding figures should be made explicit in the report, as should 
decisions about the appropriate number of significant digits to be reported. 
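
The kind of documentation called for in guideline 6.5.9 can be made concrete with a brief sketch. The Python functions below are hypothetical and are not prescribed by these standards. They assume a 1-to-5 response scale and treat missing responses by averaging only the items that were answered; a technical report would state whichever treatment was actually used.

```python
def reverse_item(score, low=1, high=5):
    """Reverse-score one item on a low..high scale (1 becomes 5, 2 becomes 4)."""
    return low + high - score


def scale_score(responses, reversed_items, low=1, high=5):
    """Combine items into a scale score (mean of the answered items).

    responses      -- dict mapping item name to a score, or None if missing
    reversed_items -- set of item names that must be reversed before combining
    Returns None when no item was answered.
    """
    values = []
    for item, score in responses.items():
        if score is None:
            continue  # missing data: skipped here; document the rule used
        if item in reversed_items:
            score = reverse_item(score, low, high)
        values.append(score)
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    answers = {"q1": 5, "q2": 1, "q3": None}
    print(scale_score(answers, reversed_items={"q2"}))
```

Recording the reversed items, the missing-data rule, and the combining formula in this explicit form is one way to give a reader enough detail to replicate the derived variable.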



RELATED STANDARDS 

2.9. Standard for Preparing a Written Design 

3.3. Standard for Ethical Treatment of Data Providers 

6.2. Standard for Reviewing the Report 



Appendix A. Related Standards 



The following standards were used as resource materials throughout the development of the 
Standards. 

1. American Educational Research Association, American Psychological Association, and 
National Council on Measurement in Education. 1985. Standards for Educational and 
Psychological Testing. Washington, D.C. 

2. U.S. Department of Energy. Office of Statistical Standards. Energy Information
Administration. 1989. Standards Manual. Washington, D.C.

3. Evaluation Research Society. 1982. Standards for Evaluation Practice. 

4. General Accounting Office. 1988. Government Auditing Standards. Washington, D.C. 

5. Joint Committee on Standards for Educational Evaluation. 1981. Standards for 
Evaluations of Educational Programs, Projects, and Materials. New York, N.Y. 

6. U.S. Department of Education. National Center for Education Statistics. 1987. 
Standards and Policies. Washington, D.C. 

7. U.S. Department of Health and Human Services. Public Health Service. Office of Health 
Research, Statistics, and Technology. National Center for Health Statistics. 1980. Draft 
Guidelines for Statistics and Information on Effects of the Environment on Health. 
Washington, D.C. 

The following table indicates which of the above sets of standards may be referred to for 
additional information concerning the issues addressed by each of the standards in this document. 
Column key:

AERA  AERA, APA, NCME, Standards for Educational and Psychological Testing (1985)
EIA   EIA, Standards Manual (1989)
ERS   ERS, Standards for Evaluation Practice (1982)
GAO   GAO, Government Auditing Standards (1988)
JC    Joint Committee, Standards for Evaluations of Educational Programs, Projects, and
      Materials (1981)
NCES  NCES, Standards and Policies (1987)
NCHS  NCHS, Draft Guidelines for Statistics and Information on Effects of the Environment
      on Health (1980)

Standard                                          AERA  EIA   ERS   GAO   JC    NCES  NCHS

1.1. Creating an Infrastructure to Manage Data                            ■
     Collection Activities
1.2. Justifying Data Collection Activities                    ■     ■     ■     ■     ■
1.3. Fostering Commitment of All Participants                 ■           ■     ■     ■
1.4. Creating an Appropriate Management Process               ■                 ■     ■
2.1. Formulating and Refining Study Questions
2.2. Choosing the Data Collection Methods                                       ■     ■
2.3. Developing a Sampling Plan                                                 ■
2.4. Assessing the Value of Obtainable Data
2.5. Transforming Study Question Concepts into                ■           ■
     Measures
2.6. Designing the Data Collection Instrument           ■     ■           ■     ■     ■
2.7. Minimizing Total Study Error (Sampling and                                 ■
     Nonsampling)
2.8. Reviewing and Pretesting Data Collection     ■                             ■     ■
     Instruments, Forms, and Procedures
2.9. Preparing a Written Design                                           ■     ■
3.1. Preparing for Data Collection                ■     ■     ■                 ■
3.2. Selecting and Training Data Collection Staff ■           ■     ■     ■
3.3. Ethical Treatment of Data Providers          ■           ■           ■     ■
3.4. Minimizing Burden and Nonresponse            ■     ■     ■           ■     ■     ■
3.5. Implementing Data Collection Quality               ■           ■     ■           ■
     Control Procedures
3.6. Documenting Data Collections                       ■     ■                 ■

4.1. Planning Systems 
Requirements 




M 
4.2. Designing DaU Processing 
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4.3. Developing DaU 
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4.4. Testing DaU Processing 
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4.5. Planning for DaU 
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4.6. Preparing DaU for 
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DaU Files 
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Processing Activities 
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4 9 Evaluating DaU 
Processing Systems 
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5 1 Prpmrina an Analysis Plan 
















5.2. Developing Analysis Variables
5.3. Applying Appropriate Weights
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5.4. Estimating Sampling and 
Nonsampling Errors 




■ 
5.5. Determining Statistical Significance






■ 
6.2. Reviewing the Report
6.3. Releasing Data
















6.4. Disseminating Data
















6.5. Preparing Documentation and Technical Reports
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Appendix B. Other Related Standards 

The following sets of standards were referred to for specific topics covered by one or more phases.

1. General Accounting Office. 1986. Developing and Using Questionnaires. Washington, 
D.C. 

2. Joint Committee on Testing Practices. 1988. Code of Fair Testing Practices in 
Education. 

3. National Institutes of Health. Manual for Statistical Presentations. Bethesda, MD. 

4. Office of Management and Budget. Office of Information and Regulatory Affairs. 1989. 
Information Collection Review Handbook. 

5. U.S. Department of Commerce. Federal Information Processing Standards (FIPS). 
Washington, D.C. 

6. U.S. Department of Commerce. Office of Federal Statistical Policy and Standards. 1978. 
Statistical Policy Handbook. Washington, D.C. 

The following table indicates which sets of standards may be referred to for additional information 
on topics addressed by each of the standards listed. 









GAO, 
Developing and 

Using 
Questionnaires 

(1986) 


Joint 
Committee, 
Code of Fair 

Testing 
Practices in 
Education 


NIH, Manual 
for Statistical 
Presentations 


OMB, 
Information 
Collection 

Review 
Handbook 

(1989) 


U.S. Dept. of 
Commerce, 

Federal 
Information 
Processing 
Standards 
(FIPS)


U.S. Dept. of
Commerce,
Statistical

Policy 
Handbook 
(1978) 




1.1. Creating an Infrastructure to
Manage Data Collection 
Activities 














1.2. Justifying Data Collection 
Activities 














1.3. Fostering Commitment of All 
Participants 






m 




m 
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1.4. Creating an Appropriate 
Management Process 






m 




m 




2.1. Formulating and Refining 
Study Questions 














2.2. Choosing the Data Collection 
Methods 














2.3. Developing a Sampling Plan 














2.4. Assessing the Value of 
Obtainable Data 


■ 












2.5. Transforming Study Question
Concepts into Measures 


■ 












2.6. Designing the Data Collection 
Instrument 


■ 














2.7. Minimizing Total Study Error
(Sampling and Nonsampling) 


■ 












2.8. Reviewing and Pretesting Data 
Collection Instruments, Forms, 
and Procedures 


■ 














2.9. Preparing a Written Design 
















3.1. Preparing for Data Collection 














3.2. Selecting and Training Data 
Collection Staff 
















3.3. Ethical Treatment of Data
Providers 
















3.4. Minimizing Burden and 
Non response 
















3.5. Implementing Data Collection 
Quality Control Procedures 
















3.6. Documenting Data Collections 














4.1. Planning Systems Requirements 
















4.2. Designing Data Processing 
Systems 


4.3. Developing Data Processing
Systems


■ 












4.4. Testing Data Processing
Systems
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4.5. Planning for Data Preparation 














4.6. Preparing Data for Processing 
and Analysis 














4.7. Maintaining Programs and
Data Files 














4.8. Documenting Data Processing 
Activities 














4.9. Evaluating Data Processing 
Systems 














5.1. Preparing an Analysis Plan 














5.2. Developing Analysis Variables 
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5.3. Applying Appropriate Weights 














5.4. Estimating Sampling and 
Nonsampling Errors 














5.5. Determining Statistical 
Significance 






■ 








6.1. Presenting Findings


. 


■ 










6.2. Reviewing the Report
6.3. Releasing Data














6.4. Disseminating Data














6.5. Preparing Documentation and
Technical Reports















INDEX 

STANDARDS BY TOPIC OF INTEREST 






Index. Standards by Topic of Interest 



Throughout this document, standards are grouped within the six key phases of data collection and 
reporting activities. There are, however, a number of themes that run through all of the phases, such as 
developing sound management practices, minimizing burden on data providers, improving the quality and 
rate of responses, protecting the confidentiality of data providers, fully documenting data collection 
activities, and providing training for those involved in data collection activities. 

This index groups standards in different phases by key subject areas to provide a convenient 
reference for looking at important topics across multiple phases. 



Accuracy

Standard 1.2.  Justifying Data Collection Activities
Standard 2.4.  Assessing the Value of Obtainable Data
Standard 2.6.  Designing the Data Collection Instrument
Standard 4.1.  Planning System Requirements



Burden

Standard 1.2.  Justifying Data Collection Activities
Standard 2.6.  Designing the Data Collection Instrument
Standard 3.2.  Selecting and Training Data Collection Staff
Standard 3.3.  Ethical Treatment of Data Providers
Standard 3.4.  Minimizing Burden and Nonresponse

Confidentiality

Standard 3.2.  Selecting and Training Data Collection Staff
Standard 3.3.  Ethical Treatment of Data Providers
Standard 3.4.  Minimizing Burden and Nonresponse
Standard 3.5.  Implementing Data Collection Quality Control Procedures
Standard 3.6.  Documenting Data Collections
Standard 4.1.  Planning Systems Requirements
Standard 6.1.  Presenting Findings
Standard 6.3.  Releasing Data



Data Provider

Standard 1.1.  Creating an Infrastructure to Manage Data Collection Activities
Standard 1.2.  Justifying Data Collection Activities
Standard 1.3.  Fostering Commitment of All Participants
Standard 2.2.  Choosing the Data Collection Methods
Standard 2.6.  Designing the Data Collection Instrument
Standard 2.7.  Minimizing Total Study Error
Standard 3.1.  Preparing for Data Collection
Standard 3.3.  Ethical Treatment of Data Providers
Standard 3.4.  Minimizing Burden and Nonresponse
Standard 3.6.  Documenting Data Collections
Standard 6.3.  Releasing Data

Data Requestor

Standard 1.1.  Creating an Infrastructure to Manage Data Collection Activities
Standard 1.2.  Justifying Data Collection Activities
Standard 1.3.  Fostering Commitment of All Participants
Standard 3.1.  Preparing for Data Collection
Standard 4.1.  Planning System Requirements
Standard 5.1.  Preparing an Analysis Plan

Data Users

Standard 1.1.  Creating an Infrastructure to Manage Data Collection Activities
Standard 1.2.  Justifying Data Collection Activities
Standard 1.3.  Fostering Commitment of All Participants
Standard 4.1.  Planning System Requirements
Standard 5.1.  Preparing an Analysis Plan

Documentation

Standard 2.9.  Preparing a Written Design
Standard 3.6.  Documenting Data Collections
Standard 4.8.  Documenting Data Processing Activities
Standard 5.1.  Preparing an Analysis Plan
Standard 6.5.  Preparing Documentation and Technical Reports

Evaluation

Standard 1.1.  Creating an Infrastructure to Manage Data Collection
Standard 2.5.  Transforming Study Question Concepts into Measures
Standard 2.7.  Minimizing Total Study Error
Standard 2.9.  Preparing a Written Design
Standard 4.9.  Evaluating Data Processing Systems

Information Needs

Standard 1.1.  Creating an Infrastructure to Manage Data Collection Activities
Standard 1.2.  Justifying Data Collection Activities
Standard 1.3.  Fostering Commitment of All Participants
Standard 2.1.  Formulating and Refining Study Questions
Standard 2.2.  Choosing the Data Collection Methods
Standard 2.6.  Designing the Data Collection Instrument
Standard 4.1.  Planning System Requirements



Instrument Development

Standard 2.6.  Designing the Data Collection Instrument
Standard 3.1.  Preparing for Data Collection
Standard 4.1.  Planning Systems Requirements



Management

Standard 1.1.  Creating an Infrastructure to Manage Data Collection
Standard 1.2.  Justifying Data Collection Activities
Standard 2.9.  Preparing a Written Design
Standard 3.1.  Preparing for Data Collection
Standard 3.2.  Selecting and Training Data Collection Staff
Standard 3.5.  Implementing Data Collection Quality Control Procedures
Standard 4.1.  Planning Systems Requirements
Standard 5.1.  Preparing an Analysis Plan



Planning

Standard 1.3.  Fostering Commitment of All Participants
Standard 2.7.  Minimizing Total Study Error
Standard 3.1.  Preparing for Data Collection
Standard 4.1.  Planning Systems Requirements
Standard 4.5.  Planning for Data Preparation
Standard 4.8.  Documenting Data Processing Activities
Standard 5.1.  Preparing an Analysis Plan
Standard 6.2.  Reviewing the Report



Quality

Standard 1.4.  Creating an Appropriate Management Process
Standard 2.2.  Choosing the Data Collection Methods
Standard 2.3.  Developing a Sampling Plan
Standard 2.7.  Minimizing Total Study Error
Standard 3.5.  Implementing Data Collection Quality Control Procedures
Standard 4.6.  Preparing Data for Processing and Analysis
Standard 6.1.  Presenting Findings
Standard 6.2.  Reviewing the Report



Response

Standard 2.6.  Designing the Data Collection Instrument
Standard 2.7.  Minimizing Total Study Error (Sampling and Nonsampling)
Standard 2.8.  Reviewing and Pretesting Data Collection Instruments, Forms, and
               Procedures
Standard 3.2.  Selecting and Training Data Collection Staff
Standard 3.4.  Minimizing Burden and Nonresponse

Review and Approval

Standard 1.1.  Creating an Infrastructure to Manage Data Collection and Reporting
Standard 2.6.  Designing the Data Collection Instrument
Standard 2.7.  Minimizing Total Study Error
Standard 2.8.  Reviewing and Pretesting Data Collection Instruments, Forms, and
               Procedures
Standard 2.9.  Preparing a Written Design
Standard 3.1.  Preparing for Data Collection
Standard 6.2.  Reviewing the Report



Sampling 



Standard 2.2. 
Standard 2.3. 
Standard 2.7. 
Standard 2.8. 

Standard 2.9. 
Standard 3.1. 
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abstracting - the process of converting information from existing records into a format required for 
analysis. 

alternative hypothesis - see hypothesis. 

analytic techniques - techniques which address questions about how or why units in the population are 
related (see descriptive techniques). 

archiving - a managed system of storing data files off line in a secure area for an extended period of time. 

audiences - persons and organizations who will be guided in making decisions by the results of the data 
collection activity and all others (data users) who are interested in the results of the data collection. 

base weight - the product of all component weights; accounts for all stages of sampling units. 

bias (due to nonresponse) - difference that occurs when respondents differ as a group from nonrespondents 
on a characteristic being studied. 

bias (of an estimate) - the difference between the expected value of a sample estimate and the 
corresponding true or target value for the population. 

burden - the aggregate hours realistically required for data providers to participate in a data collection 
activity. 

CAPI - Computer Assisted Personal Interviewing enables data collection staff to use portable 
microcomputers to administer a data collection form while viewing the form on the computer screen. As 
responses are entered directly into the computer, they are used to guide the interview and are automatically 
checked for specified range, format, and consistency edits. 

CATI - Computer Assisted Telephone Interviewing uses a computer system that allows a telephone 
interviewer to administer a data collection form over the phone while viewing the form on a computer 
screen. As the interviewer enters responses directly into the computer, the responses are used to guide 
the interview and are automatically checked for specified range, format, and consistency edits. 

CD ROM - Compact Disc Read Only Memory is a computer storage disk in the same physical form as
an audio CD. A CD ROM disk can store approximately 550 megabytes of digital data (e.g., about 1.5 
thousand pages). 

census - a count of all the elements of a population and a determination of the distributions of the 
characteristics. 

closed-ended question - a type of question in which the data provider's responses are limited to given 
alternatives (see open-ended question). 
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codebook - a record of each variable being measured, including variable name, columns occupied by each 
variable in the data file, and values used to define each variable. 

coding - the act of categorizing raw data into groups or giving the data numerical values. 

component weight - for each stage of sampling, the component weight is equal to the reciprocal of the
probability of selecting the unit at that stage. 

Computer Assisted Personal Interviewing - see CAPI. 

Computer Assisted Telephone Interviewing - see CATI. 

confidence interval - a sample-based estimate expressed as an interval or range of values within which 
the true or target population value is expected to be located (with a specified level of confidence given 
as a percentage). 
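
As an illustration of this definition, the following Python sketch computes an approximate 95 percent confidence interval for a sample mean. The function name is hypothetical. The sketch assumes a simple random sample and uses the normal approximation (multiplier 1.96), which is only one of several ways such an interval may be constructed.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (lower, upper) bounds of an approximate confidence interval.

    Assumes simple random sampling; z=1.96 corresponds to roughly
    95 percent confidence under the normal approximation.
    """
    n = len(values)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean - z * standard_error, mean + z * standard_error


if __name__ == "__main__":
    scores = [2, 4, 4, 4, 5, 5, 7, 9]
    print(mean_confidence_interval(scores))
```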

consistency edits - see logic edits. 

construct - a concept that describes a characteristic, attribute, or variable relationship. The concepts are 
often unobservable ideas or abstractions such as community context or student educational performance. 

correlation - the tendency for certain values or levels of one variable to occur with particular values or 
levels of another variable. 

correlation coefficient - a measure of association between two variables that can range from -1.00 (perfect 
negative relationship) to 0 (no relationship) to +1.00 (perfect positive relationship). 
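
The range described above can be seen in a small Python sketch. It computes the Pearson product-moment coefficient, which is an assumption made for illustration; the definition above does not name a particular coefficient, and the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    print(pearson_r([1, 2, 3], [2, 4, 6]))  # perfect positive relationship
    print(pearson_r([1, 2, 3], [6, 4, 2]))  # perfect negative relationship
```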

data collection activities - all phases of data collection and reporting. 

data dictionary - a database that holds the name, type, range of values, sources, and authorization for 
access for each data element in an organization's files and database. It may also indicate which 
application programs use that data so that when a change in data structure is contemplated, a list of the 
affected programs can be generated. 

data element - the most basic unit of information. In data processing, it is the fundamental data structure. 
It is defined by its size (in characters) and data type (e.g., alphanumeric, numeric only, true/false, date) 
and may include a specific set of values or range of values. 

data file backup - copies of the latest data files that can be used to restore lost data. 

data flow diagram - depicts the movement of data within a system by describing the data and the manual 
and machine processing performed on the data. 

data producer - agency or organization that carries out the actual study design, data collection, 
processing, analysis, and reporting. The term encompasses all members of the project staff including 
managers, data collectors, data processors, data analysts, and data reporters. 
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data provider - agency, organization, or individual who supplies data. 

data requestor - agency or organization that requests or sponsors a data collection and reporting activity. 

data users - agencies, organizations, or individuals who use the data developed by the data producer. 
Although the term data user may refer to the data requestor, it may also refer to other entities or 
individuals, including other agencies, individual researchers, the media, and members of the public, who 
utilize the results of a data collection activity in some way. 

design effect - the variance of an estimate divided by the variance of the estimate that would have 
occurred if a sample of the same size had been selected using simple random sampling. 
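
The ratio in this definition can be sketched in a few lines of Python. The function name and the variance figures are hypothetical and serve only to illustrate the arithmetic; in practice both variances would themselves be estimated from the survey data.

```python
def design_effect(var_actual_design: float, var_srs: float) -> float:
    """Variance under the actual sample design divided by the variance a
    simple random sample of the same size would have produced."""
    if var_srs <= 0:
        raise ValueError("simple random sampling variance must be positive")
    return var_actual_design / var_srs


# Hypothetical figures: a clustered design yields a variance of 4.5 where a
# simple random sample of the same size would have yielded 3.0.
deff = design_effect(4.5, 3.0)
print(deff)
```

A value above 1 indicates that the design is less precise than simple random sampling for that estimate; a value below 1 indicates that it is more precise.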

field test - the study of a data collection activity in the setting where it is to be conducted. 

file - a set of related records. 

file corruption - a file that has been physically damaged or that contains errors on the magnetic surface 
that prevent the file from being accessed by a computer. 

file design - the method used in a file to store and retrieve data. 

file recovery - copying the file from a current backup version. 

file regeneration - the process of running necessary software to rebuild a file. 

final adjusted weight - the product of all base-weight components, nonresponse adjustments, and the 
poststratification factor, if any. 

"hot deck" imputation - a process that replaces missing data items with values observed from other 
sampled cases. 
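
A minimal sketch of one common variant follows, assuming that donors are drawn at random from cases in the same imputation class. The use of classes, the function name, and the school records are assumptions made for illustration; the definition above does not prescribe how donors are chosen.

```python
import random


def hot_deck_impute(records, item, class_key, seed=0):
    """Fill missing values of `item` with a value observed in another
    sampled case from the same class (hypothetical helper)."""
    rng = random.Random(seed)
    # Collect observed (non-missing) values by class; these are the donors.
    donors = {}
    for rec in records:
        if rec[item] is not None:
            donors.setdefault(rec[class_key], []).append(rec[item])
    completed = []
    for rec in records:
        new_rec = dict(rec)  # leave the original records untouched
        pool = donors.get(new_rec[class_key])
        if new_rec[item] is None and pool:
            new_rec[item] = rng.choice(pool)
        completed.append(new_rec)
    return completed


schools = [
    {"id": 1, "locale": "urban", "enrollment": 520},
    {"id": 2, "locale": "urban", "enrollment": None},
    {"id": 3, "locale": "rural", "enrollment": 140},
    {"id": 4, "locale": "rural", "enrollment": None},
]
completed = hot_deck_impute(schools, "enrollment", "locale")
```

Cases in a class with no donor are left missing in this sketch, so that they can be handled by another method.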

hypothesis - an assumption about a property or characteristic of a population. In statistical theory, there 
are usually two hypotheses under consideration, and the goal is to decide which of the two hypotheses is 
likely to be true. The null hypothesis usually corresponds to the hypothesis to be tested by accepted 
statistical techniques. That is, the null hypothesis is considered to be true unless there is compelling 
evidence from the sample data that it is false. The alternative hypothesis is the complement of the null 
hypothesis. 

imputation for item or survey nonresponse - substituting plausible values for missing values in the 
survey data set. 

input - any data entered into a computer or the act of entering data by keyboard, light pen, mouse, 
graphics tablet, magnetic disk or tape, communications channel, or key-punch card. 

item nonresponse - items on a data collection form that are missing when a response was expected. 

linearity - a relationship in which, when any two variables are plotted, a straight line results. 

logic edits - checks made of the data to ensure common sense consistency among the responses from a 
data provider. 

master file - a collection of records holding the current data in an information system. It contains 
descriptive data as well as summary information. 

modelling - a general class of statistical techniques in which the mathematical relationships between 
several variables are developed and analyzed. 

multiple comparison procedures - statistical procedures that take into account the fact that several 
statistical tests are being performed simultaneously. In particular, multiple comparison procedures give 
the proper type I errors, considering the number of comparisons being made. 

nonresponse - cases in a data collection activity in which potential data providers are contacted but refuse 
to reply or are unable to do so for reasons such as deafness or illness. 

nonresponse bias - occurs when respondents as a group differ from nonrespondents in their answers to 
items on a data collection form. 

nonsampling error - an error in sample estimates that cannot be attributed to sampling fluctuations. Such 
errors may arise from many sources including imperfect selection, bias in response or estimation, and 
errors of observation and recording. 

null hypothesis - see hypothesis. 

open-ended question - a type of interview item that does not limit the potential responses to 
predetermined alternatives (see closed-ended question). 

operational definition - the sequence of steps or procedures a researcher follows to obtain a measurement; 
specifies how to measure a variable. 

operationalize - to describe constructs or variables in concrete terms so that measurements can be made. 

optical disk - a disk that is read optically (i.e., by laser technology), rather than magnetically. 

out-of-range response - a response that is outside of the predetermined range of values considered 
acceptable for a particular item. 

outliers - an observation so far separated in value from other observations that it raises the question of 
whether it comes from a different population or whether the sampling technique is at fault. 

output - any computer-generated information appearing on hard copy, video display, or machine readable 
form (e.g., disk or tape). 

pilot test - a brief and simplified preliminary study designed to test methods and to learn whether a 
proposed data collection activity appears likely to yield valid, reliable, and useful results. 

population - all individuals in the group to which conclusions from a data collection activity are to be 
applied. 

population variance - a measure of dispersion defined as the average of the squared deviations between 
the observed values of the elements of a population or sample and the corresponding mean of those values. 

poststratification adjustment - a weight adjustment that forces study estimates to match independent 
population totals within selected poststrata (adjustment cells). 
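
As an illustration, the adjustment factor for each poststratum can be computed as the independent population total divided by the weighted sample total in that cell. The sketch below assumes every sampled case falls in exactly one cell; the cell names, weights, and control totals are hypothetical.

```python
def poststratification_factors(weights, cells, control_totals):
    """Return, for each cell, control total / weighted sample total."""
    weighted_totals = {}
    for w, c in zip(weights, cells):
        weighted_totals[c] = weighted_totals.get(c, 0.0) + w
    return {c: control_totals[c] / t for c, t in weighted_totals.items()}


def apply_factors(weights, cells, factors):
    """Multiply each case weight by the factor for its cell."""
    return [w * factors[c] for w, c in zip(weights, cells)]


# Hypothetical sample of four cases in two poststrata.
weights = [10.0, 10.0, 20.0, 20.0]
cells = ["elementary", "elementary", "secondary", "secondary"]
control_totals = {"elementary": 30.0, "secondary": 36.0}

factors = poststratification_factors(weights, cells, control_totals)
adjusted = apply_factors(weights, cells, factors)
```

After adjustment, the weights in each cell sum to the control total for that cell, which is the sense in which the estimates are forced to match.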

precision - the difference between a sample-based estimate and its expected value. Precision is measured 
by the sampling error (or standard error) of an estimate. 

pretest - a test to determine performance prior to the administration of a data collection activity. 

probability sample - a sample selected by a method such that each unit has a fixed and determined 
probability of selection. 

processing - the manipulation of data. 

range check - a determination of whether responses fall within a predetermined set of acceptable values. 
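
A range check of this kind might be sketched as follows. The function name, the choice to skip missing values (which fall under item nonresponse rather than out-of-range response), and the grade-level data are assumptions made for illustration.

```python
def range_check(responses, low, high):
    """Return the positions of responses outside the acceptable range
    [low, high]. Missing responses (None) are skipped here."""
    return [
        i
        for i, r in enumerate(responses)
        if r is not None and not (low <= r <= high)
    ]


# Hypothetical grade-level item with acceptable values 0 through 12.
grades = [3, 12, 14, -1, None, 0]
flagged = range_check(grades, 0, 12)
print(flagged)
```

The flagged positions identify the out-of-range responses that would then be reviewed or corrected.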

record format - the layout of the information contained in a data record (includes the name, type, and 
size of each field in the record). 

records - a logical grouping of data elements within a file upon which a computer program acts. 

regression - a statistical technique in which the functional relationship between a dependent variable and 
one or more independent variables can be estimated. 

reliability - the consistency in results of an achievement test or measurement, including the tendency of 
the test or measurement to produce the same results when applied two times or more to some entity or 
attribute believed not to have changed in the interval between measurements. 

replication techniques - methods of estimating sampling errors that involve repeated estimation of the 
same statistic using various subsets of data providers. The two primary methods are balanced repeated 
replication (BRR) and the jackknife technique. 
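
The idea of repeated estimation on subsets can be illustrated with a simple delete-one jackknife. This sketch assumes an unweighted sample with no strata or clusters, which is far simpler than the designs for which replication is normally used; the function names and scores are hypothetical.

```python
import math


def jackknife_se(values, statistic):
    """Delete-one jackknife standard error of `statistic`."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    # Re-estimate the statistic n times, leaving out one case each time.
    replicates = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    mean_rep = sum(replicates) / n
    variance = (n - 1) / n * sum((r - mean_rep) ** 2 for r in replicates)
    return math.sqrt(variance)


def mean(xs):
    return sum(xs) / len(xs)


scores = [2.0, 4.0, 4.0, 6.0]
se = jackknife_se(scores, mean)
```

For the sample mean, this reproduces the familiar standard error, the sample standard deviation divided by the square root of the sample size, which provides a convenient check on the sketch.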

respondent - the individual or agency who completes a survey (e.g., the person who marks the answers 
on a survey instrument or who provides answers verbally to a data collector). 

sample - a subgroup selected from the entire population. 

sampling error - that part of the difference between a value for an entire population and an estimate of 
that value derived from a probability sample that results from observing only a sample of values. 

sampling strata - mutually exclusive and exhaustive subsets of the population within which elements of 
the population have similar characteristics, to the extent feasible. 

sampling variance of estimates - the distribution of values of a statistic that would occur if the survey 
were repeated a large number of times using the same sample design, instrument, and data collection 
methodology. The square root of the sampling variance is the standard error (see standard error of an 
estimate). 

simple hypothesis - a statistical hypothesis that completely specifies the distribution function of the 
variates concerned in the hypothesis. 

special population - a subset of the total population distinguishable by unique needs, characteristics, or 
interests (e.g., disadvantaged students, gifted students, physically or mentally handicapped students, 
vocational education students). 

standard deviation - the most widely used measure of dispersion of a frequency distribution. It is equal 
to the positive square root of the population variance. 
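
The relationship between the two definitions can be shown with a short calculation. The values below are hypothetical, and the divisor is the number of values, following the definition of population variance given in this glossary.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: the average of the squared deviations from the mean.
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / len(values)

# Standard deviation: the positive square root of the population variance.
std_dev = math.sqrt(variance)

# The standard library offers the same quantities directly.
assert math.isclose(variance, statistics.pvariance(values))
assert math.isclose(std_dev, statistics.pstdev(values))
```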

standard error of an estimate - the positive square root of the sampling variance. It is a measure of the 
sampling distribution of a statistic. Standard errors are used to establish confidence levels for the statistics 
being analyzed. 

statistically significant - there is a low probability that the result is attributable to chance alone. 

symmetry - refers to a property of a relationship in which no distinction is made between independent 
and dependent variables; "asymmetry" refers to a property of a relationship in which such a distinction 
is made. For example, the Pearson product-moment correlation coefficient is a symmetric measure of 
linear association between two intervally scaled variables while the linear regression coefficient is an 
asymmetric measure for such variables. 

system flowchart - shows the flow through components of the data system or the relationship of macro- 
components of the data system. 

system operator - a person who is responsible for the physical operation of the equipment (e.g., CPUs, 
disk drives, tape drives, printers). 

systems design stage - the second of four stages in the development of an information processing system; 
it expands upon the systems requirements specified in the systems requirements stage. 

systems development stage - the third of four stages in the development of a data processing system; 
entails writing computer programs that follow the systems design specifications. 

systems requirements stage - the first of four stages in the development of an information processing 
system; it defines the scope of the project. 

systems testing stage - the fourth of four stages in the development of a data processing system; it 
provides for the testing of all computer programs according to a test plan before the system is considered 
operational. 

t-test - any statistical test in which the underlying distribution of the test statistic has a Student's 
t-distribution. Most commonly, t-tests are used to compare the means of two subsets of a population 
under study. 

transaction file - a collection of records that record any change in a master file. The data in transaction 
files are used to update the data in master files. Transaction files also serve as audit trails and, after a 
period of time, are transferred from on-line disk to off-line tape for future statistical and historical 
processing. 

type I error - in a formal statistical test of a hypothesis, the probability of rejecting the null hypothesis 
when it is true. 

type II error - in a formal statistical test of a hypothesis, the probability of accepting the null hypothesis 
when it is false. 

unit nonresponse - failure of a survey respondent to provide a response to any of the study questions. 

validation sample - a sample drawn from a study so that the survey responses can be compared with 
values from another source that is assumed to contain the true or target values. 

validity - the capacity of a measuring instrument to predict what it was designed to predict; stated most 
often in terms of the correlation between values on the instrument and measures of performance on some 
criterion. 

verification - checking the accuracy of data collected. 

weighted estimates - estimates from a sample survey in which the sample data are weighted (multiplied) 
by factors reflecting the sample design. The weights (referred to as sampling weights) are typically equal 
to the reciprocals of the overall selection probabilities, multiplied by a nonresponse or poststratification 
adjustment. 
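
A minimal sketch of the weighting described above follows, assuming each case has a known overall selection probability and a single multiplicative adjustment. The function names, probabilities, adjustments, and enrollment figures are hypothetical.

```python
def sampling_weight(selection_probability, adjustment=1.0):
    """Reciprocal of the selection probability, multiplied by a
    nonresponse or poststratification adjustment."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return adjustment / selection_probability


def weighted_total(values, weights):
    """Weighted estimate of a population total."""
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical sample of three schools.
probabilities = [0.1, 0.1, 0.25]
adjustments = [1.25, 1.25, 1.0]
enrollments = [400, 600, 1000]

weights = [sampling_weight(p, a) for p, a in zip(probabilities, adjustments)]
estimate = weighted_total(enrollments, weights)
```

Here the first two schools each represent 12.5 schools in the population and the third represents 4, so the estimated total enrollment is 16,500.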